Snapshot through Rest API

SLM

Member
Webcall server 5.2.1847 on AWS EC2

A POST call to [server]:8444/rest-api/stream/snapshot results in 200 OK with no snapshot data.

In the log file we can see a SNAPSHOT_COMPLETE event with the image data included, but it arrives after the REST API request has ended. For example:
20-12-2023 10:37:36 [Method:snapshot Data]
20-12-2023 10:37:37 [Method:StreamStatusEvent Data] ("status":"SNAPSHOT_COMPLETE", "info":[imagedata])

This is different from what happened on previous versions / servers.
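
To detect the empty response from a script, something like the following Python sketch could be used. The `https` scheme, the `streamName` request field and the base64 `data` response field are assumptions about the payload shape, and `build_snapshot_request` / `extract_snapshot` are hypothetical helper names, not part of WCS.

```python
import base64
import json
import urllib.request


def build_snapshot_request(server, stream_name):
    """Build a POST request for the snapshot endpoint quoted above.

    The https scheme and the "streamName" body field are assumptions.
    """
    url = f"https://{server}:8444/rest-api/stream/snapshot"
    body = json.dumps({"streamName": stream_name}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_snapshot(response_body):
    """Return decoded image bytes, or None when a 200 OK carries no snapshot.

    Assumes the image arrives base64-encoded in a "data" field.
    """
    if not response_body.strip():
        return None
    payload = json.loads(response_body)
    encoded = payload.get("data") if isinstance(payload, dict) else None
    if not encoded:
        return None
    return base64.b64decode(encoded)
```

The request can be sent with `urllib.request.urlopen(req)` and the result of `response.read()` passed to `extract_snapshot`; a `None` return value corresponds to the "200 OK with no snapshot data" case.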

Also, please note that links are broken on your website to the release notes:
 

Max

Administrator
Staff member
Good day.
A POST call to [server]:8444/rest-api/stream/snapshot results in 200 OK with no snapshot data.
We can't reproduce this on our test server with build 5.2.1847 (default settings)
You can try increasing the maximum time allowed for taking a snapshot: Configuration
Code:
snapshot_taking_interval_ms=10000
snapshot_taking_attempts=100
Also, please note that links are broken on your website to the release notes:
We already fixed this.
 

SLM

Member
Unfortunately I spoke too soon. It appears the error is caused by 0 bytes free on the drive, which in turn is caused by flashphoner.log being flooded with entries like these:

Code:
12:35:57,014 WARN         StunUdpSocket - STUN-UDP-pool-43-thread-208 [id: 0x43a9e0a1, /172.31.12.52:31986] Failed to send or receive message
java.lang.NullPointerException: message
    at org.jboss.netty.channel.DownstreamMessageEvent.<init>(Unknown Source)
    at org.jboss.netty.channel.Channels.write(Unknown Source)
    at com.flashphoner.ice.A.J.A(Unknown Source)
    at com.flashphoner.ice.A.J.handleDownstream(Unknown Source)
    at org.jboss.netty.channel.Channels.write(Unknown Source)
    at com.flashphoner.ice.D.B.A.send(Unknown Source)
    at com.flashphoner.D.E.M.A(Unknown Source)
    at com.flashphoner.D.E.M$3.A(Unknown Source)
    at com.flashphoner.D.A.A.A(Unknown Source)
    at com.flashphoner.D.E.M.B(Unknown Source)
    at com.flashphoner.D.E.O.A(Unknown Source)
    at com.flashphoner.D.E.D.A(Unknown Source)
    at com.flashphoner.D.E.D.A(Unknown Source)
    at com.flashphoner.D.E.D.A(Unknown Source)
    at com.flashphoner.D.E.D.A(Unknown Source)
    at com.flashphoner.D.E.M.A(Unknown Source)
    at com.flashphoner.D.E.B.C.G.A(Unknown Source)
    at com.flashphoner.D.E.B.A$_A.A(Unknown Source)
    at com.flashphoner.D.E.M.C(Unknown Source)
    at com.flashphoner.D.E.M.dataPacketReceived(Unknown Source)
    at com.flashphoner.D.E.G.D(Unknown Source)
    at com.flashphoner.D.E.G.dataPacketReceived(Unknown Source)
    at com.flashphoner.ice.D.B.A(Unknown Source)
    at com.flashphoner.ice.D.B.A.A(Unknown Source)
    at com.flashphoner.ice.D.B.A$_A.messageReceived(Unknown Source)
    at org.jboss.netty.channel.Channels.fireMessageReceived(Unknown Source)
    at com.flashphoner.ice.A.H.handleUpstream(Unknown Source)
    at org.jboss.netty.channel.Channels.fireMessageReceived(Unknown Source)
    at org.jboss.netty.channel.socket.nio.NioDatagramWorker.read(Unknown Source)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.processSelectedKeys(Unknown Source)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(Unknown Source)
    at org.jboss.netty.channel.socket.nio.NioDatagramWorker.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
12:35:57,017 ERROR           MuxEncoder - STUN-UDP-pool-43-thread-208 Failed to encode compound RTCP packet to send.
java.lang.ArrayIndexOutOfBoundsException
12:35:57,023 ERROR           MuxEncoder - STUN-UDP-pool-43-thread-208 Packets [ReceiverReportPacket{senderSsrc=2032057515, receptionReports=[ReceptionReport{ssrc=3129457433, fractionLost=0, cumulativeNumberOfPacketsLost=30
At this rate the server drive will be 100% full within a day.
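
A rough way to estimate how long the drive will last is sketched below in Python. It assumes the log grows at a constant rate; `hours_until_full` and `free_bytes_on` are hypothetical helper names, and the growth rate has to be measured separately (for example from the sizes of the hourly rotated logs).

```python
import shutil


def hours_until_full(free_bytes, growth_bytes_per_hour):
    """Estimate hours until the drive is full at a constant log growth rate."""
    if growth_bytes_per_hour <= 0:
        return float("inf")
    return free_bytes / growth_bytes_per_hour


def free_bytes_on(path="/"):
    """Free space, in bytes, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free
```

For example, `hours_until_full(free_bytes_on("/usr/local"), measured_rate)` gives the remaining time for the volume holding the WCS logs, where `measured_rate` is whatever growth per hour was observed.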
 

SLM

Member
This is highly unexpected and has never happened before. It now happens on 2 different AWS servers: one with the AWS marketplace AMI with an hourly license, and one with a regular EC2 image and FF manually installed. Both are on the latest FF version. I will try the
ice_tcp_transport = true setting on one of these servers.
 

SLM

Member
With this setting I get the following error:
WebRTC: ICE failed, add a STUN server and see about:webrtc for more details
 

Max

Administrator
Staff member
Regarding TCP: it fails because the TCP ports are closed in the AWS security group.

1. Check the AWS instance's security group settings.
2. Add the TCP port range:
TCP 31000 - 33000
3. Make sure the port range matches the media port settings in flashphoner.properties:
media_port_from = 31000
media_port_to = 33000

 

SLM

Member
This is the standard setting:
media_port_from =31001
media_port_to =32000

So besides adding the TCP rule to AWS, do I have to edit this configuration setting in Flashphoner, changing 31001 to 31000 and 32000 to 33000?
 

Max

Administrator
Staff member
No. You only have to change the AWS rules.
The AWS rules must match the current settings.

For example, if the current settings are
media_port_from =31001
media_port_to =32000

AWS rules should be:
TCP 31001-32000
UDP 31001-32000
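
The matching rule above can be checked with a short Python sketch. It does not query AWS: `rules` is a hypothetical list of (protocol, from_port, to_port) tuples copied by hand from the security group's inbound rules, and both function names are illustrative only.

```python
def media_port_range(properties_text):
    """Read media_port_from / media_port_to from flashphoner.properties text."""
    values = {}
    for line in properties_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return int(values["media_port_from"]), int(values["media_port_to"])


def missing_protocols(port_range, rules):
    """Return the protocols (TCP, UDP) with no rule covering the whole range.

    `rules` is a hand-transcribed list of (protocol, from_port, to_port).
    """
    low, high = port_range
    missing = []
    for protocol in ("TCP", "UDP"):
        covered = any(
            proto.upper() == protocol and start <= low and high <= end
            for proto, start, end in rules
        )
        if not covered:
            missing.append(protocol)
    return missing
```

With the settings 31001-32000 and only a UDP rule present, `missing_protocols` reports TCP as missing, which is the situation where ICE over TCP fails.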
 

SLM

Member
This seems to be working on one of the servers (with the hourly AWS license). This TCP setting was missing in the predefined rules list.

However, this TCP setting was and still is present on the other (manually installed) EC2 server. Besides the Flashphoner WCS update nothing changed there, and it never had any issues with overflowing log files or with the huge number of errors that caused this. So what has changed? Is it in the WCS update? The logs were eating up 70GB of space in no time.
 

Max

Administrator
Staff member
Hello

Please upload the latest huge logs if you haven't cleaned them all up.
/usr/local/FlashphonerWebCallServer/logs/server_logs/flashphoner.log
This log is rotated hourly. We would like to check a few hours where the logs are too big.
You can send a download link via the Report Form.

What was your previous version, before 5.2.1847?
This looks like a possible regression related to RTP packet sending.

Another way you can monitor exception statistics:

Output example:
Code:
-----Errors info-----
java.lang.IndexOutOfBoundsException=372
com.flashphoner.sdk.softphone.exception.SoftphoneCallException=12
org.apache.http.NoHttpResponseException=1
java.net.BindException=13
com.flashphoner.server.commons.rmi.operations.exception.ClientNotFoundException=1
javax.crypto.AEADBadTagException=100
com.flashphoner.media.J.B=3
java.lang.NullPointerException=497
com.flashphoner.media.J.C=1
java.net.SocketException=3
org.codehaus.jackson.map.JsonMappingException=2
com.flashphoner.sdk.softphone.exception.SoftphoneException=11
com.flashphoner.server.client.SubscribeStreamsLimitException=276
java.net.ConnectException=2825
java.text.ParseException=133
java.util.NoSuchElementException=5
javax.net.ssl.SSLProtocolException=18
java.nio.channels.ClosedChannelException=306
javax.net.ssl.SSLHandshakeException=651
java.io.IOException=832
java.lang.NumberFormatException=3
java.lang.ArrayIndexOutOfBoundsException=325
java.lang.IllegalArgumentException=102594
com.flashphoner.rest.server.exception.MalformedRtmpUrlException=34
java.lang.Exception=42
java.io.FileNotFoundException=2
java.lang.reflect.InvocationTargetException=903262
org.jboss.netty.handler.codec.frame.TooLongFrameException=12
javax.net.ssl.SSLException=726
org.codehaus.jackson.JsonParseException=2135
com.flashphoner.sip.D.A.C=9
com.flashphoner.server.remote.J=4111
com.flashphoner.server.commons.rmi.operations.exception.UrlConflictException=113
java.net.NoRouteToHostException=1191

If the error rate is too high, it would look like a regression. So please help us gather error logs and error statistics; then we will be able to investigate this.
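
To compare statistics between servers or between builds, the output above can be parsed and ranked with a few lines of Python. This is only a sketch that assumes the `ExceptionClass=count` line format shown in the example; `parse_error_stats` and `top_errors` are hypothetical helper names.

```python
def parse_error_stats(text):
    """Parse 'ExceptionClass=count' lines into a dict, skipping other lines."""
    counts = {}
    for line in text.splitlines():
        name, sep, value = line.strip().rpartition("=")
        if sep and name and value.isdigit():
            counts[name] = int(value)
    return counts


def top_errors(counts, n=3):
    """Return the n most frequent exception classes, highest count first."""
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)[:n]
```

The header line `-----Errors info-----` contains no `=` and is skipped, so the raw output can be pasted in as is.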
 

SLM

Member
Hello

Please upload the latest huge logs if you haven't cleaned them all up.
/usr/local/FlashphonerWebCallServer/logs/server_logs/flashphoner.log
This log is rotated hourly. We would like to check a few hours where the logs are too big.
You can send a download link via the Report Form.
I have uploaded an old log file via the form. Also included a link to a client log of which there are also a lot:
bad-request-HLS-[id]

which is weird because we don't use HLS.

What was your previous version, before 5.2.1847?
This looks like a possible regression related to RTP packet sending.
I'm not sure, but I think it was 5.2.1744 on the manually installed server and 5.2.1825 on the Flashphoner AMI.

Another way you can monitor exception statistics:
This is from the manually installed server which has received no traffic since Dec 22nd.
Code:
-----Errors info-----
com.flashphoner.rest.server.exception.NoSpaceLeftException=14
java.nio.file.NoSuchFileException=1
com.flashphoner.media.rtp.D.D.A.E=23
java.nio.channels.ClosedChannelException=144
javax.net.ssl.SSLHandshakeException=26
java.io.IOException=146
java.lang.ArrayIndexOutOfBoundsException=1892098
java.lang.Exception=6
java.lang.NullPointerException=945743
java.io.FileNotFoundException=57636048
java.lang.reflect.InvocationTargetException=1

Edit:
And this is from the server with the AMI which is receiving all of the traffic now:

Code:
-----Errors info-----
java.lang.IndexOutOfBoundsException=1
java.net.SocketException=190
com.flashphoner.media.rtp.D.D.A.E=61
javax.net.ssl.SSLHandshakeException=25
java.nio.channels.ClosedChannelException=116
java.io.IOException=1
java.lang.ArrayIndexOutOfBoundsException=20332
java.lang.NullPointerException=9965
java.lang.Exception=1
java.lang.reflect.InvocationTargetException=5
 

Max

Administrator
Staff member
I have uploaded an old log file via the form. Also included a link to a client log of which there are also a lot:
bad-request-HLS-[id]

which is weird because we don't use HLS.
Someone is trying to DDoS the server with HLS requests. Maybe they are trying to exploit a vulnerability present in some media servers. But WCS is resistant to this, so you can ignore those logs, or disable HLS entirely if you don't use it:
Code:
hls_server_enabled=false
About the NPE in the logs: we raised ticket WCS-4014 to investigate, but we need the server settings to check it (a full report collected by the script: Getting logs with report.sh script).
We also recommend updating WCS to the latest build, 5.2.1852.
 

SLM

Member
Someone is trying to DDoS the server with HLS requests. Maybe they are trying to exploit a vulnerability present in some media servers. But WCS is resistant to this, so you can ignore those logs, or disable HLS entirely if you don't use it:
Code:
hls_server_enabled=false
Can we also close TCP ports 8445 and 8082? I don't think it's a DoS attack, however, because after we switch the traffic from our sites to the other server, the logs do not grow that much.

About the NPE in the logs: we raised ticket WCS-4014 to investigate, but we need the server settings to check it (a full report collected by the script: Getting logs with report.sh script).
We also recommend updating WCS to the latest build, 5.2.1852.
I will collect this tomorrow
 

SLM

Member
I have sent the reports from both servers to support@flashphoner.com. Please note that the log I sent yesterday via the form was an old log file that was no longer present on that server at the time of the report, and that the server's WCS version had also been updated after all traffic was cut from it.
 

Max

Administrator
Staff member
It seems the reports were collected with TCP enabled:
ice_tcp_transport = true
in flashphoner.properties

We don't see any issues with TCP enabled.
Could you comment out this setting and prepare a report for the UDP configuration?
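
If the setting has to be toggled on several servers, commenting it out can be scripted. The Python sketch below assumes `#` starts a comment line in flashphoner.properties (Java properties style); `comment_out` is a hypothetical helper that only transforms the text and leaves reading and writing the file to the caller.

```python
def comment_out(properties_text, key):
    """Prefix every active `key = value` line with '#', leaving the rest intact."""
    result = []
    for line in properties_text.splitlines():
        stripped = line.strip()
        name = stripped.partition("=")[0].strip()
        if not stripped.startswith("#") and "=" in stripped and name == key:
            result.append("#" + line)
        else:
            result.append(line)
    return "\n".join(result)
```

Lines that are already commented out are left unchanged, so running it twice on the same text is harmless.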
 