Hello Flashphoner,
We've been experiencing an odd issue with the WebCall Server where, at random, the main Java process gets stuck at 100% CPU.
We are unable to reproduce the issue: it has only occurred on our production environment, has happened 3 times in the past month, and strikes at random points with no obvious explanation (the latest occurrence was at 17:30 UTC, when there was no user activity).
Given that it's on production and we cannot reproduce it on a test server, we are fairly limited in the access and details we can provide, but this is what we can share for now:
When: The latest incident occurred on 2025-07-24 between 17:30 and 17:40 UTC.
This is when the instance CPU metric jumped from 5% to 50%, which appears to coincide with the Java process running at 100% CPU.
Server Logs: We believe there isn't much to go on in the logs, but this entry stood out. It doesn't seem like much, yet it's the only unique log entry in the 7 days leading up to the error, and it occurred around the same time:
Code:
17:35:51,434 INFO bstractNioWorkerPool - SSL-WS-BOSS-pool-18-thread-1
Workers size 4
Cemetery size 1
index:id:state:dead_for_ms
0:SSL-WS-pool-19-thread-1405:RUNNABLE:14507
17:35:51,465 INFO Dumper - Thread-89086 Jstack execution..
17:35:52,154 INFO Dumper - Thread-89086 Jstack is done with exit code 0. Location:/usr/local/FlashphonerWebCallServer/logs/CemeteryDump.jstack
17:35:52,154 INFO Dumper - Thread-89086 Destroying process Process[pid=2928107, exitValue=0]
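In case it helps to correlate with the CemeteryDump.jstack output, below is a rough sketch (our own idea, not something shipped with WCS) of how we might sample per-thread CPU time with the standard ThreadMXBean API the next time the spike happens. For simplicity it samples the JVM it runs in, so in practice it would have to run inside the WCS JVM (for example as a small agent) or go through a remote JMX connection to that process.
Code:
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: samples per-thread CPU time twice and prints the
// threads that burned the most CPU in between, so their names/ids can be
// matched against a jstack dump taken at the same moment.
public class HotThreadProbe {
    public static void main(String[] args) throws InterruptedException {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        if (!threads.isThreadCpuTimeSupported()) {
            System.err.println("Per-thread CPU time is not supported on this JVM");
            return;
        }
        threads.setThreadCpuTimeEnabled(true);

        // First sample: CPU time (nanoseconds) per live thread.
        Map<Long, Long> before = new HashMap<>();
        for (long id : threads.getAllThreadIds()) {
            before.put(id, threads.getThreadCpuTime(id));
        }

        Thread.sleep(5_000); // sampling window

        // Second sample: report threads that used more than ~20% of one core.
        for (long id : threads.getAllThreadIds()) {
            Long start = before.get(id);
            long end = threads.getThreadCpuTime(id);
            if (start == null || start < 0 || end < 0) {
                continue; // thread started or died during the window
            }
            long cpuMs = (end - start) / 1_000_000;
            if (cpuMs > 1_000) {
                ThreadInfo info = threads.getThreadInfo(id);
                String name = (info == null) ? "unknown" : info.getThreadName();
                System.out.printf("%s (id=%d) used %d ms of CPU in 5 s%n", name, id, cpuMs);
            }
        }
    }
}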
We are also getting these ERROR logs on the server, but we don't believe they are related, since they occur frequently both before and after the CPU spike:
Code:
java.io.IOException: Broken pipe
at java.base/sun.nio.ch.SocketDispatcher.write0(Native Method)
at java.base/sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:62)
at java.base/sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:137)
...
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
at java.base/java.lang.Thread.run(Thread.java:1583)
- Last occurrence before the spike: 2025-07-24 15:36 UTC
- Next occurrence after the spike: 2025-07-24 19:36 UTC
Client Logs: There was no activity in the client logs around the time the CPU spiked.

AWS Properties:
We are running the media server on a t3a.medium instance (2 vCPUs, 4 GiB RAM).
WCS Core Properties: These are the changes we currently apply to wcs-core.properties (everything else is based on the settings in AMI 5.2.2105):
Code:
-Xms2g -Xmx2g
-XX:NewSize=256m
Further Info: We also don't believe we ever saw this CPU issue on previous versions of the media server:
- WCS 5.2.2071 - from the start of October 2024 to the start of June 2025 (this was on an older Flashphoner AMI, so a few things have changed since)
- WCS 5.2.2247 - from the start of June 2025 to the start of July 2025 (it was only up for about a month, but there were no issues)
- WCS 5.2.2269 - from the start of July 2025 to now (as mentioned, the CPU issue has occurred 3 times across 6 servers)
One thing we noticed: although we use a script to modify wcs-core.properties, the `-XX:+UseConcMarkSweepGC` flag hasn't been getting set, and in fact hasn't been set since we started using WCS 5.2.2071. We don't believe this is the cause of the issue, but testing it would require deploying to production servers and then waiting at least a month.
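For reference, this is roughly what we expected the JVM settings in wcs-core.properties to look like once the script has run (the last line is the flag that is currently missing; whether it is still appropriate for the JDK bundled with the current build is something we would want to confirm, since CMS was deprecated and later removed from newer JDK releases):
Code:
-Xms2g -Xmx2g
-XX:NewSize=256m
-XX:+UseConcMarkSweepGC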
We're sorry we cannot share more right now, given the circumstances of the issue. If needed, we can compile additional log data and send it to technical support at support@flashphoner.com.
Any help in understanding and fixing this issue would be greatly appreciated.
Cheers,
Taylor