Flashphoner becomes very slow

neogeo

Member
Flashphoner becomes very slow after 100 connections.
We have 2-3 users streaming voice and screen sharing. When we approach 100 connected users, things become very slow. By slow I mean that it takes 20 seconds from the time a user clicks play until they start receiving video/audio frames; with no load it usually takes 2-3 seconds.

We use the REST API for authorization and I have noticed that it takes too long to pass from one state to another:

Here is an example, but I am not sure I can make my point clear:
grep -B7 747d4190-901c-11e8-9772-d119da6f5b7c flashphoner_manager.log | grep " OBJECT"

14:26:10,238 INFO agerRemoteRmiService - RMI TCP Connection(208)-127.0.0.1 SEND REST OBJECT ==>
14:26:12,866 INFO agerRemoteRmiService - RMI TCP Connection(208)-127.0.0.1 RECEIVED REST OBJECT <==
14:26:20,346 INFO agerRemoteRmiService - RMI TCP Connection(222)-127.0.0.1 SEND REST OBJECT ==>
14:26:23,113 INFO agerRemoteRmiService - RMI TCP Connection(222)-127.0.0.1 RECEIVED REST OBJECT <==

The first two lines are for /stream/flashphoner/playStream and the other two are for /flashphoner/StreamStatusEvent.
It took around 8 seconds between the two states when we have 100 users, whereas it takes less than 1 second when there is no load.
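
To make the timing explicit, here is a minimal sketch that computes the gaps between consecutive log lines. It assumes every line starts with an HH:MM:SS,mmm timestamp, as in the excerpt above, and that the lines do not cross midnight. The function names are made up for illustration.

```python
from datetime import datetime, timedelta

# The four log lines quoted above, in order.
LOG_LINES = [
    "14:26:10,238 INFO agerRemoteRmiService - RMI TCP Connection(208)-127.0.0.1 SEND REST OBJECT ==>",
    "14:26:12,866 INFO agerRemoteRmiService - RMI TCP Connection(208)-127.0.0.1 RECEIVED REST OBJECT <==",
    "14:26:20,346 INFO agerRemoteRmiService - RMI TCP Connection(222)-127.0.0.1 SEND REST OBJECT ==>",
    "14:26:23,113 INFO agerRemoteRmiService - RMI TCP Connection(222)-127.0.0.1 RECEIVED REST OBJECT <==",
]


def parse_timestamp(line):
    """Parse the leading HH:MM:SS,mmm timestamp of a log line."""
    return datetime.strptime(line.split()[0], "%H:%M:%S,%f")


def gaps_ms(lines):
    """Return the gaps between consecutive log lines in whole milliseconds."""
    times = [parse_timestamp(line) for line in lines]
    return [
        (later - earlier) // timedelta(milliseconds=1)
        for earlier, later in zip(times, times[1:])
    ]


for gap in gaps_ms(LOG_LINES):
    print(f"{gap} ms")
```

For the four lines above this gives 2628 ms, 7480 ms and 2767 ms. The middle value is the wait between the playStream reply and the StreamStatusEvent request.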

I am not sure where else to search.

The version is: 5.0.3471
 

Max

Administrator
Staff member
Hello
1. How much physical memory does the server have?
2. Please share your wcs-core.properties and wcs-manager.properties configs.
3. Please share your gc-core.log and gc-manager.log - latest Garbage Collector log files.
4. How many physical CPUs does the server have? What CPU metrics do you see when it runs slowly?
You can also monitor the server using JMX as described in the docs: https://docs.flashphoner.com/display/WCS5EN/Connecting+from+Visual+VM
There you can see memory, CPU and GC activity in real time.
 

neogeo

Member
Here is the zip with the files: https://files.fm/u/vk9zgd44

I have just checked the transition time from /playStream to /StreamStatusEvent when the server does not have any load:

/playStream
11:04:36,729 INFO agerRemoteRmiService - RMI TCP Connection(2375)-127.0.0.1 SEND REST OBJECT ==>
11:04:36,735 INFO agerRemoteRmiService - RMI TCP Connection(2375)-127.0.0.1 RECEIVED REST OBJECT <==
/StreamStatusEvent
11:04:37,661 INFO agerRemoteRmiService - RMI TCP Connection(2373)-127.0.0.1 SEND REST OBJECT ==>
11:04:37,667 INFO agerRemoteRmiService - RMI TCP Connection(2373)-127.0.0.1 RECEIVED REST OBJECT <==

As you can see it takes 1sec!!! I have checked the same thing on another demo server and I see it takes only 4ms. So there is something here but I am not sure where to check.


The server has 24 cores and 42 GB of RAM.
The CPU usage is 12% at most during busy hours:
[Attached image: upload_2018-7-27_14-11-31.png]
 

Max

Administrator
Staff member
neogeo said:
As you can see it takes 1sec!!! I have checked the same thing on another demo server and I see it takes only 4ms. So there is something here but I am not sure where to check.

If you are using REST hooks, that is clearly the response time of your web back-end.
Try making the same authorization REST request yourself, for example with curl.
Curl example:
Code:
curl -i -X POST -H "Content-Type:application/json" http://localhost:8888/demo-rest-jersey-spring/podcasts/ -d '{"title":"- The Naked Scientists Podcast - Stripping Down Science-new-title2","linkOnPodcastpedia":"https://github.com/Codingpedia/podcastpedia/podcasts/792/-The-Naked-Scientists-Podcast-Stripping-Down-Science","description":"The Naked Scientists flagship science show brings you a lighthearted look at the latest scientific breakthroughs, interviews with the world top scientists, answers to your science questions and science experiments to try at home."}'
This way you can measure the speed of the REST/HTTP call and fix it if your back-end responds slowly.
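
If you would rather script the measurement than read it off curl, something like the sketch below could work. It is only an illustration using the Python standard library: `post_json` and `time_call` are made-up helper names, and the URL and payload you pass in must be those of your own REST hook endpoint.

```python
import json
import time
import urllib.request


def post_json(url, payload, timeout=10.0):
    """POST a JSON payload and return the HTTP status code."""
    data = json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        response.read()  # read the whole body so the timing covers the full reply
        return response.status


def time_call(func, *args, **kwargs):
    """Run func and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start
```

Usage would look like `status, seconds = time_call(post_json, url, payload)`, repeated a few times under load and without load so the numbers can be compared.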
 

neogeo

Member
I have switched back to localhost to rule out remote server latency.

Check this out:

At 11:04:36,735 it receives the reply for the playStream event, and almost 1 second later, at 11:04:37,661, it triggers the StreamStatusEvent. It feels like an internal Flashphoner queue has a lag or something.


11:04:36,729 INFO agerRemoteRmiService - RMI TCP Connection(2375)-127.0.0.1 SEND REST OBJECT ==>
URL:http://localhost:9091/EchoApp/playStream
OBJECT:
{
"nodeId" : "SUUAE6M7ws9Ir4tzlgwmxuZDmvurfsuW@163.182.168.165",
"appKey" : "a5XME7BFjh4Jn20ivJoHBKbngJ4Axh",
"sessionId" : "/79.129.115.34:2821/163.182.168.165:443",
"mediaSessionId" : "fac551b0-918a-11e8-b4a4-3784e1d2c90d",
--
11:04:36,735 INFO agerRemoteRmiService - RMI TCP Connection(2375)-127.0.0.1 RECEIVED REST OBJECT <==
URL:http://localhost:9091/EchoApp/playStream
OBJECT:
{
"nodeId" : "SUUAE6M7ws9Ir4tzlgwmxuZDmvurfsuW@163.182.168.165",
"appKey" : "a5XME7BFjh4Jn20ivJoHBKbngJ4Axh",
"sessionId" : "/79.129.115.34:2821/163.182.168.165:443",
"mediaSessionId" : "fac551b0-918a-11e8-b4a4-3784e1d2c90d",
--
11:04:37,661 INFO agerRemoteRmiService - RMI TCP Connection(2373)-127.0.0.1 SEND REST OBJECT ==>
URL:http://localhost:9091/EchoApp/StreamStatusEvent
OBJECT:
{
"nodeId" : "SUUAE6M7ws9Ir4tzlgwmxuZDmvurfsuW@163.182.168.165",
"appKey" : "a5XME7BFjh4Jn20ivJoHBKbngJ4Axh",
"sessionId" : "/79.129.115.34:2821/163.182.168.165:443",
"mediaSessionId" : "fac551b0-918a-11e8-b4a4-3784e1d2c90d",
--
11:04:37,667 INFO agerRemoteRmiService - RMI TCP Connection(2373)-127.0.0.1 RECEIVED REST OBJECT <==
URL:http://localhost:9091/EchoApp/StreamStatusEvent
OBJECT:
{
"nodeId" : "SUUAE6M7ws9Ir4tzlgwmxuZDmvurfsuW@163.182.168.165",
"appKey" : "a5XME7BFjh4Jn20ivJoHBKbngJ4Axh",
"sessionId" : "/79.129.115.34:2821/163.182.168.165:443",
"mediaSessionId" : "fac551b0-918a-11e8-b4a4-3784e1d2c90d",
--
 

Max

Administrator
Staff member
We have checked your wcs-core.properties and wcs-manager.properties files.
You need to increase the heap memory.

wcs-core.properties
Code:
-Xmx16g -Xms16g
wcs-manager.properties
Code:
-Xmx4g -Xms4g
Most likely you have Full GC cycles that slow down the server.
The first thing you can do is increase the heap.

Even better:
1. Download the latest 5.0 build:
https://flashphoner.com/download50
2. Copy wcs-core.properties and wcs-manager.properties to your system.
3. Adjust the heap in wcs-core.properties and wcs-manager.properties:
wcs-core.properties
Code:
-Xmx16g -Xms16g
wcs-manager.properties
Code:
-Xmx4g -Xms4g
 

neogeo

Member
I did it.

I tested again and it still takes 1 second to pass from playStream event to StreamStatusEvent.

12:37:16,635 INFO agerRemoteRmiService - RMI TCP Connection(5)-127.0.0.1 SEND REST OBJECT ==>
12:37:16,645 INFO agerRemoteRmiService - RMI TCP Connection(5)-127.0.0.1 RECEIVED REST OBJECT <==

12:37:17,585 INFO agerRemoteRmiService - RMI TCP Connection(5)-127.0.0.1 SEND REST OBJECT ==>
12:37:17,595 INFO agerRemoteRmiService - RMI TCP Connection(5)-127.0.0.1 RECEIVED REST OBJECT <==

I am not sure if the above is clear or related but it indicates that the problem still persists.
 

neogeo

Member
We have finally figured it out. The problem was not related to Flashphoner; it was a faulty network switch. In fact, Flashphoner performed great under such conditions. Thanks for the support.
 

Stanley

Member
Sorry neogeo, I need to hijack your thread.

Max,
what is the best GC setting in wcs-core.properties?
I notice that our Flashphoner needs a restart almost daily. Otherwise the number of websocket connections grows to 1000+, even though we don't have that many concurrent users.

### JVM OPTIONS ###
-Xms16g -Xmx16g
#-Xcheck:jni

# Can be a better GC setting to avoid long pauses
-XX:+UseConcMarkSweepGC -XX:NewSize=1024m
#-XX:+CMSIncrementalMode
#-XX:+UseParNewGC"
 