record problem question

hyuk

Member
Hi,
The service is live, but there is an issue where the video sometimes isn't actually saved, so I'd like to ask a few questions.
About 2 out of 10 cases are not saved on the server where the problem occurred.
It looks like on_record_hook wasn't even called.
The version in use is 5.2.1665.
One notable detail is that we currently store recordings on a NAS server (record_dir).
Temporary files (the record_tmp_dir setting) are created on the server where WCS is installed.

So I am currently trying to track down the problem by guesswork.
(The server in question is located in an area that is difficult to access, so the log cannot be easily checked.)

So, I would like to ask you a few questions.
1. If the network condition of the NAS server (record_dir) is bad, are temporary files not created on the Flashphoner installation server?
2. If the connection to the NAS server is temporarily lost, what error will be recorded in flashphoner.log in Flashphoner's server_logs?
3. If there is no storage space on the NAS server, what error will be recorded in flashphoner.log in Flashphoner's server_logs?
4. Is there a hook like on_record_hook that can receive feedback on the problem when errors in 2 and 3 occur?
 

Max

Administrator
Staff member
Good day.
1. If the network condition of the NAS server (record_dir) is bad, are temporary files not created on the Flashphoner installation server?
WCS does not check network conditions when writing stream recordings to disk. It does not even know what kind of disk it is writing to, a local hard drive or a network-mounted drive; that is the job of the Linux file system. So temporary files are always created in record_tmp_dir as long as the folder exists and is writable by the flashphoner user.
2. If the connection to the NAS server is temporarily lost, what error will be recorded in flashphoner.log in Flashphoner's server_logs?
The NAS server is just a folder for WCS, so if it is not available, WCS fails to write the file to that folder. In this case, the server logs will contain a message about a file write failure.
3. If there is no storage space on the NAS server, what error will be recorded in flashphoner.log in Flashphoner's server_logs?
Before recording starts, WCS checks that a minimum amount of disk space is available (at least 1 GB by default): Minimal available disk space checking. If there is not enough disk space, recording will not start, with a "Not enough available disk space" message.
4. Is there a hook like on_record_hook that can receive feedback on the problem when errors in 2 and 3 occur?
No, there is no such hook.
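The conditions from answers 1 and 3 (the folder exists, is writable, and has enough free space) can also be checked by hand from the shell. The following is a minimal sketch, not something WCS provides: the function name check_record_dir and its status words are made up for illustration, and the 1 GB default only mirrors the figure quoted above. Run it as the flashphoner user so that the write test reflects the right permissions.

```shell
#!/bin/sh
# check_record_dir DIR [MIN_FREE_KB]
# Hypothetical helper (not part of WCS). Prints one of:
#   ok | missing | not-writable | low-space
check_record_dir() {
    dir=$1
    min_kb=${2:-1048576}   # 1 GB expressed in KiB; mirrors the default quoted above

    # The folder must exist...
    [ -d "$dir" ] || { echo "missing"; return 1; }
    # ...and be writable by the user running this check.
    [ -w "$dir" ] || { echo "not-writable"; return 1; }

    # df -P prints one line per file system; column 4 is the available space in KiB.
    avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    [ "$avail_kb" -ge "$min_kb" ] || { echo "low-space"; return 1; }

    echo "ok"
}

# Example (path taken from this thread):
# check_record_dir /usr/local/FlashphonerWebCallServer/records
```

Note that on a network mount that has stopped responding, df itself may hang, so this check is only meaningful while the mount still answers.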
 

hyuk

Member
Thank you for the quick response.
If record_dir is unavailable but record_tmp_dir is available, will a file be created only in record_tmp_dir?
 

Max

Administrator
Staff member
If record_dir is unavailable but record_tmp_dir is available, will a file be created only in record_tmp_dir?
Yes. In this case, there should be a message like this in the server logs:
Code:
Problem to move file /tmp/stream-ca00d7d0-52c2-11ee-ba8e-e9beb144939a-94brlh973bo9nlpe228922hort.mp4 to /usr/local/FlashphonerWebCallServer/records/stream-ca00d7d0-52c2-11ee-ba8e-e9beb144939a-94brlh973bo9nlpe228922hort.mp4
and the recording file will stay in the tmp dir.
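Since there is no hook for this failure, one way to notice stranded recordings is to search the server log for the message quoted above. A minimal sketch, assuming the message text matches the example exactly; the function name list_stranded_recordings is made up, and the log file path is passed in as an argument because its location depends on the installation.

```shell
#!/bin/sh
# list_stranded_recordings LOGFILE
# Hypothetical helper: prints the source path of every recording that the log
# reports as "Problem to move file SRC to DST", one path per line, without repeats.
list_stranded_recordings() {
    grep 'Problem to move file ' "$1" |
        sed 's/.*Problem to move file \(.*\) to .*/\1/' |
        sort -u
}

# Example (log path is an assumption, adjust to your installation):
# list_stranded_recordings /usr/local/FlashphonerWebCallServer/logs/server_logs/flashphoner.log
```

Each printed path is a file that should still be sitting in the tmp dir and can be copied to the record dir manually once it is reachable again.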
 

hyuk

Member
A problem arose during the repro test. Temporary files are not being created at all on the server. Can you tell me what the problem is?
It seems that recordings would be saved normally again if I restarted Flashphoner, but I have left it running without a restart so that the problem can be examined.

When I try to attach the report file, it says it is too large and won't attach.

When I clicked the "report" button above to attach it, the following error appeared.
Message <1694743251.6503bad34c016@flashphoner.com> to support@flashphoner.com was not delivered via ApMailer\Smtp. Error: 102: At the "Authorization" stage, code 235 was expected, but the server returned 535 5.7.8 https://support.google.com/mail/?p=BadCredentials lg15-20020a170906f88f00b0098e34446464sm1748617ejb.25 - gsmtp
 

Max

Administrator
Staff member
When I try to attach the report file, it says it is too large and won't attach.
Upload the archive to a cloud drive (Google Drive, OneDrive, Yandex.Disk, etc.) and send the link using this form. If sending via the form does not work, please send the link to support@flashphoner.com with this topic's URL in the Subject field.
 

Max

Administrator
Staff member
Your report contains no server configuration. Please collect it using the report.sh script: Getting logs with report.sh script
Code:
sudo ./report.sh --sysinfo --conf --tar
In the logs we see that you are recording streams published via RoomApi in the VP8 codec. In this case, the WebM container is used by default and no temporary files are created; this container is recorded directly to the record dir.
So we recommend refactoring your flow: set the record dir to a local drive on the server (preferably an SSD), and then copy the recording files to the NAS using rsync. Please also note that if you are recording RoomApi rooms, a special multiple recorder is used: Multiple stream recording to one file with subsequent mixing. It uses a separate hook script, on_multiple_record_hook.sh, and mixes the room's streams into one file.
 

hyuk

Member
Temporary files have always been created until now.
In fact, we are testing with two servers, and temporary files are still being created normally on the other server.
record_tmp_dir=/usr/local/FlashphonerWebCallServer/records (local hard drive)

For reference, not only is the temporary file not created, but record_hook is not called either.

Each stream is recorded separately, one by one.
So I don't use on_multiple_record_hook.sh.
I will send the report again so that it includes our server settings.
 

Max

Administrator
Staff member
The temporary dir is not used when recording streams in VP8: WebM files are recorded directly to the record dir. So if the record dir is not available, the recording file will not be created at all, and the hook script will not be called. As we recommended above, please do not use a network drive.
 

hyuk

Member
Since I didn't get the answer I wanted, I restarted the problematic webcallserver.
As I mentioned above, once I restart the webcallserver, a temporary file is created normally and record_hook runs.
I don't understand why it would not work when using WebM.
I am using VP8, and temporary files are created and record_hook works.

1694765056961.png



A temporary file was created as shown below.
1694764881629.png


Just in case, I'll send you the logs that are currently operating normally...
 

Max

Administrator
Staff member
You can use the MKV container for VP8 streams: MP4, WebM, MKV containers support. Note, however, that browsers cannot play MKV directly.
We checked jstack.log from the previous reports and did not find any suspicious thread locks. If the problem occurs again, it would be ideal to get SSH access to the server.
In any case, recording files directly to a network-attached drive is bad practice. Even if the NAS is available, the operation may be too slow. So consider using rsync or a similar tool to move recordings to the NAS.
 

hyuk

Member
SSH access details have been sent via the report form.
The issue where live video is not saved has been reproducing on the server for about 4 hours now.
 

Max

Administrator
Staff member
We checked the server.
It seems the server is too weak to write more than one file simultaneously (only 2 vCPUs), so the recording queues keep growing. When recording room streams, there can be many streams in a room, so use a more powerful server.
We also see blocked recording threads. This can occur if the NAS is temporarily lost and the file system does not return an error. Please change your flow: set record_dir to a local server folder (an SSD is strongly recommended), then synchronize the folder content to the external NAS, for example from the on_record_hook script. Note that it is recommended to start rsync in a separate process to avoid a long wait. We recommend rsync or a similar tool because such tools support error correction when the connection to the NAS is temporarily lost.
The server now has to be restarted because of the blocked threads.
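The recommended flow (record locally, then push each finished file to the NAS from the hook in a separate process) could look roughly like this. This is only a sketch under assumptions: the function name sync_recording, the NAS mount point and the sync log path are made up, and it assumes the hook script receives the path of the finished recording as its second argument, which should be verified against the on_record_hook.sh shipped with your WCS version.

```shell
#!/bin/sh
# sync_recording SRC_FILE DEST_DIR LOG_FILE
# Hypothetical helper for an on_record_hook.sh style script.
# Copies one finished recording to DEST_DIR in the background so the hook
# returns immediately instead of waiting on a slow or unreachable NAS.
sync_recording() {
    src_file=$1
    dest_dir=$2
    log_file=$3

    # -a          preserve timestamps and permissions
    # --partial   keep partially transferred files so a retry can reuse them
    # --timeout   give up if no data moves for 30 seconds
    # nohup + &   detach rsync from the hook process
    nohup rsync -a --partial --timeout=30 "$src_file" "$dest_dir/" \
        >>"$log_file" 2>&1 &
}

# Example call from the hook (assumes $2 is the recording file path;
# /mnt/nas/records and /tmp/record_sync.log are placeholders):
# sync_recording "$2" /mnt/nas/records /tmp/record_sync.log
```

Because the copy runs detached, a failed transfer shows up only in the sync log, so a periodic rsync of the whole local record folder is a reasonable safety net for files that were missed.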
 

hyuk

Member
The above server is a temporary server created to reproduce the problem, so its specifications are not good.

Currently, there is a 15% chance that the video is not being saved on a server that we cannot access, and I think this may be because some recording threads are blocked. (For reference, the server specifications of the server where the problem occurred are very good. However, there may have been a temporary CPU load because transcoding was being done separately.)

How many recording threads are running?

Is rebooting the WCS the only way to revive blocked recording threads?

Is there any way for us to find blocked recording threads?
I'm asking because I would like to reboot WCS if I can detect that recording threads are blocked due to CPU load.
 

Max

Administrator
Staff member
How many recording threads are running?
4 by default. This value can be changed: Recording performance tuning under high load
Is rebooting the WCS the only way to revive blocked recording threads?
Yes.
I'm asking because I would like to reboot WCS if I can detect that recording threads are blocked due to CPU load.
It seems the thread blocking is caused not by CPU load but by the network drive being unavailable.
Rebooting WCS periodically is not a good solution. Please change the records dir to a local folder on a fast SSD, then move the finished recording files to a network location. More IOPS, faster recording, fewer thread blocking issues.
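For the case described here, where threads hang because a network drive stops answering, the operating system offers a rough signal of its own: on Linux, a thread stuck in I/O on an unresponsive mount is normally shown in state D (uninterruptible sleep). The sketch below counts such threads for a given process ID. It is an OS-level heuristic with a made-up function name, not the method used to inspect the server in this thread, and a short-lived D state is normal during ordinary disk I/O, so only a count that stays above zero across repeated checks is suspicious.

```shell
#!/bin/sh
# count_dstate_threads PID
# Hypothetical helper (Linux only): prints how many threads of process PID are
# currently in uninterruptible sleep (state D) according to /proc.
count_dstate_threads() {
    # Each thread has its own status file with a line such as "State:  D (disk sleep)".
    # grep -c exits non-zero when the count is 0, hence the "|| true".
    cat /proc/"$1"/task/*/status 2>/dev/null |
        grep -c '^State:[[:space:]]*D' || true
}

# Example: sample the WCS Java process a few times, 10 seconds apart
# (WCS_PID is a placeholder for the real process ID):
# for i in 1 2 3; do count_dstate_threads "$WCS_PID"; sleep 10; done
```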
 

hyuk

Member
I will consider using SSD local storage.
"Also we see a blocked recording threads."
Could you please share how to check blocked recording threads?
 