Server keeps crashing with load

Discussion in 'Web Call Server 5' started by Azhar Ali, Feb 7, 2020.

  1. Azhar Ali

    Azhar Ali Member

    Hi,
    We have a server running version 5.2.213. We created an image of that server and wanted to update it to the latest version, so we started a new instance from the image and followed the documentation to update.
    Everything works in the demo, but when we switch our application over to that server, which brings around 600 viewers onto the stream, it crashes and we have to restart webcallserver.
    I have sent the SSH details by email in case you can help us find out what's wrong. If we can set a time, we can also generate the load live.
    Our setup runs on Google Cloud, and the hardware is the same on both servers.

    Regards
    Azhar
  2. Max

    Max Administrator Staff Member

    Hello
    Currently we recommend the following tuning for high-load and load testing cases:
    https://docs.flashphoner.com/display/WCS52EN/Memory management in Java#MemorymanagementinJava-TheZGarbageCollector
    1. Install JDK 12
    2. Enable the ZGC garbage collector
    3. Configure huge pages
    4. Increase heap in conf/wcs-core.properties
    Code:
    -Xmx32g -Xms32g
    Please perform the tuning and run your test again.
    Let us know if you have any issues.
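    Step 1 of the tuning list can be checked before touching the configuration. The sketch below is illustrative only: `java_major` is a hypothetical helper, not part of WCS or the JDK, and it assumes the usual first line of `java -version` output (which is written to stderr).

```shell
#!/bin/sh
# Hypothetical helper: extract the major Java version from the first line
# of `java -version` output, e.g.  openjdk version "12.0.2" 2019-07-16
java_major() {
    v=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
    case $v in
        1.*) v=${v#1.} ;;   # legacy scheme: 1.8.0_241 means Java 8
    esac
    # Keep only the leading component (cut at the first '.', '_' or '-').
    echo "${v%%[._-]*}"
}

# Only query the JVM if java is actually on the PATH.
if command -v java >/dev/null 2>&1; then
    major=$(java_major "$(java -version 2>&1 | head -n 1)")
    if [ "${major:-0}" -ge 12 ]; then
        echo "JDK $major found"
    else
        echo "JDK ${major:-unknown} found; this thread recommends JDK 12"
    fi
fi
```

    If the reported version is lower than 12, the ZGC options discussed in this thread may not be accepted by the JVM.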

    If you encounter a crash or other unexpected behavior, please provide the server date and time of the crash and describe how it looks to users (can't play streams, can't connect to the server, server process is not running, etc.).
  3. AzharAli83

    AzharAli83 New Member

    Hello Max,

    I have followed the instructions, but the server does not start now.
    I added the following to wcs-properties.
    Code:
    # ZGC
    -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -Xms24g -Xmx24g -XX:+UseLargePages -XX:ZPath=/hugepages
    I also ran the following commands:
    Code:
    mkdir /hugepages
    echo "echo 13824 >/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages" >>/etc/rc.local
    echo "mount -t hugetlbfs -o uid=0 nodev /hugepages" >>/etc/rc.local
    chmod +x /etc/rc.d/rc.local
    systemctl enable rc-local.service
    systemctl restart rc-local.service
    
    I also updated the JVM options in wcs-core to
    Code:
    -Xmx32g -Xms32g
    
    I am not sure whether the above settings are correct for my server.
    My server is on Google Cloud with 16 vCPUs and 64 GB of RAM.
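    As a rough cross-check of the numbers in this post, the sketch below converts a heap size in GiB into a count of 2048 kB huge pages. The function name `hugepages_for_heap` and its optional headroom argument are illustrative, not part of WCS or the JDK; how much headroom beyond the heap is actually needed is an assumption to verify against the Flashphoner documentation.

```shell
#!/bin/sh
# Hypothetical helper: number of 2048 kB huge pages needed to back a heap.
# $1 = heap size in GiB, $2 = extra headroom in GiB (assumption, default 0)
hugepages_for_heap() {
    heap_gib=$1
    headroom_gib=${2:-0}
    # 1 GiB = 1024 MiB, and each huge page is 2 MiB.
    echo $(( (heap_gib + headroom_gib) * 1024 / 2 ))
}

hugepages_for_heap 24      # prints 12288: pages for the 24g heap alone
hugepages_for_heap 24 3    # prints 13824: matches the nr_hugepages value in this post
```

    On this reading, 13824 pages reserve 27 GiB of the 64 GB of RAM, which covers the 24g heap but would not cover a 32g one. After a reboot, `grep Huge /proc/meminfo` shows how many pages the kernel actually reserved.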
    By the way, why are emails from flashphoner not delivered to outlook.com? I spent several days trying to recover my password and in the end had to contact support. I also never got any email notifications about your replies.
    I have already sent the SSH access details to support, but if you don't mind having a look at the settings, I would be grateful.
  4. Max

    Max Administrator Staff Member

    Good day.
    Unfortunately, since Monday the server has been returning "Permission denied (publickey,gssapi-keyex,gssapi-with-mic)" in response to the SSH credentials you sent. Please check and send us the current credentials.
    For some reason, Outlook.com places emails from flashphoner.com in the Spam folder. Please check there and move the messages to your Inbox; this should help train the Outlook mail filter.
  5. Max

    Max Administrator Staff Member

    We resolved the access problem on our side and checked your server. You should change the following in the wcs-core.properties file:
    1. Remove this line to avoid duplicating the heap settings
    Code:
    -Xmx32g -Xms32g
    2. Comment out the following lines
    Code:
    -XX:+UseConcMarkSweepGC
    -XX:+UseCMSInitiatingOccupancyOnly
    -XX:CMSInitiatingOccupancyFraction=70
    -XX:+PrintGCDateStamps
    -XX:+PrintGCDetails
    
    3. Change the following line
    Code:
    -Xloggc:/usr/local/FlashphonerWebCallServer/logs/gc-core-
    to
    Code:
    -Xlog:gc*:/usr/local/FlashphonerWebCallServer/logs/gc-core-:time
    Then the server should start correctly.
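    The three changes above can be cross-checked with a small script before restarting. This is a minimal sketch, not a Flashphoner tool: the function name `check_jvm_opts` is hypothetical, and it only looks for the specific options named in this thread.

```shell
#!/bin/sh
# Hypothetical checker: flags JVM options that conflict with the ZGC setup
# described in this thread. $1 = path to wcs-core.properties.
check_jvm_opts() {
    file=$1
    rc=0
    # Count active (uncommented) lines that set the maximum heap size.
    xmx_lines=$(grep -v '^[[:space:]]*#' "$file" | grep -c -- '-Xmx')
    if [ "$xmx_lines" -gt 1 ]; then
        echo "duplicate heap settings: $xmx_lines lines set -Xmx"
        rc=1
    fi
    # CMS and legacy GC logging options that should be commented out.
    for opt in UseConcMarkSweepGC UseCMSInitiatingOccupancyOnly \
               CMSInitiatingOccupancyFraction PrintGCDateStamps PrintGCDetails; do
        if grep -v '^[[:space:]]*#' "$file" | grep -q -- "-XX:.*$opt"; then
            echo "still active: $opt"
            rc=1
        fi
    done
    # The old -Xloggc form should be replaced by -Xlog:gc*.
    if grep -v '^[[:space:]]*#' "$file" | grep -q -- '-Xloggc:'; then
        echo "still active: -Xloggc (use -Xlog:gc* instead)"
        rc=1
    fi
    return $rc
}
```

    A possible invocation, assuming the file lives under the same installation directory as the log path above: `check_jvm_opts /usr/local/FlashphonerWebCallServer/conf/wcs-core.properties && echo "no conflicts found"`.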
  6. Azhar Ali

    Azhar Ali Member

    Hello Max,
    Thanks for the info. After making those changes server did start.
    I started a test and streamed a video on 10-Feb-2020 at 9:17am UTC. It streamed fine at first, but as more viewers connected, webcallserver crashed. Both the publisher and the viewers got a Failed status notification.
    After that, the demo systems also stopped working and I had to restart the webcallserver service.
    If only 2-3 users are connected to the stream, it does not crash.
  7. Azhar Ali

    Azhar Ali Member

    I don't get them in Spam either. I also tried registering an account with a Gmail address, and no email was delivered to Spam or Inbox there either. This used to work before.
  8. Azhar Ali

    Azhar Ali Member

    Hello Max,
    Just adding this to make sure my two replies have not gone unnoticed.
    Regards
    Azhar
  9. Max

    Max Administrator Staff Member

    Good day.
    We have raised internal ticket WCS-2509 to investigate the crash and will let you know the results.
    As a workaround, we recommend rolling back to build 5.2.477.
  10. Azhar Ali

    Azhar Ali Member

    Thanks, that has resolved the issue and we are able to test it now.
  11. Max

    Max Administrator Staff Member

    Good day.
    The silent crash during stream encoding has been fixed in build 5.2.515. Please update and check.
    If a silent crash occurs again, please collect the stdout log by running the server as follows:
    Code:
    cd /usr/local/FlashphonerWebCallServer/bin
    ./webcallserver start standalone > /usr/local/FlashphonerWebCallServer/logs/stdout.log 2>&1 &
    
    Then reproduce the crash and send us the stdout.log file.
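    Since support also asked earlier in the thread for the server date and time of a crash, a short summary can accompany the log. The sketch below is only an illustration: `crash_report` is a hypothetical helper, the default of 200 lines is an arbitrary assumption, and the full stdout.log should still be sent as requested.

```shell
#!/bin/sh
# Hypothetical helper: print the current server time in UTC followed by the
# tail of a log file. $1 = log path, $2 = number of lines (default 200).
crash_report() {
    log=$1
    lines=${2:-200}
    echo "Server time (UTC): $(date -u '+%Y-%m-%d %H:%M:%S')"
    echo "Last $lines lines of $log:"
    tail -n "$lines" "$log"
}
```

    For example, right after a crash: `crash_report /usr/local/FlashphonerWebCallServer/logs/stdout.log > crash-summary.txt` (the output file name is arbitrary).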
