@REPLYADDR pozz <pozzugno@gmail.com>
On 08/08/2023 18:54, David Brown wrote:
> On 08/08/2023 17:14, pozz wrote:
>> On 08/08/2023 16:31, David Brown wrote:
>>> On 08/08/2023 14:59, pozz wrote:
>>>> On 08/08/2023 12:27, David Brown wrote:
>>>>> On 07/08/2023 23:28, pozz wrote:
>>>> [...]
>>>>>> What do you suggest?
>>>>>>
>>>>>> PS: In the past I read only a few posts regarding Linux
>>>>>> development, even if it's for embedded devices. However I don't
>>>>>> know how to ask questions related to linux development, I noticed
>>>>>> Usenet linux groups are somewhat dead.
>>>>>
>>>>> I don't know what kind of information you are needing, but an easy
>>>>> option might be to have the python service regularly write out a
>>>>> json format file with the current status or other information. The
>>>>> web app can have Javascript that regularly reads that file and
>>>>> handles it on the user's web browser. And if you want to go the
>>>>> other way, your Python code can use "inotify" waits to see file
>>>>> writes from the web server.
>>>>
>>>> Sincerely I don't like your solution. First of all, you are writing
>>>> regularly on a normal file in the filesystem. Ok, maybe I can use a
>>>> tmpfs filesystem in RAM.
>>>>
>>>
>>> That would be the normal choice, yes.
>>>
>>>> Another issue I see is synchronization. Without a sync mechanism,
>>>> the reader could read bad data, because the writer is writing to it.
>>>>
>>>
>>> You typically handle this by writing to "status.tmp", then renaming
>>> (moving) it to "status.json", or whatever names you are using.
>>> Renaming a file like this is guaranteed atomic on Linux - anything
>>> attempting to open a handle to "status.json" will either get the old
>>> file (which is kept alive while the file descriptor is open) or the
>>> new file. This is not the first situation in which people wanted to
>>> avoid reading half-written files!
>>
>> Good thing to know.
>>
>> Just to better understand what happens. If reader opens status.json
>> just before the writer rename status.tmp to status.json, we will have
>> a process (the reader) that reads from the old version of
>> "status.json" instead of the new version that is really on the
>> filesystem?
>>
>
> Yes, exactly.
>
> A file in Linux exists independently from filenames. There can be many
> things pointing to a file, and the file exists until there are no more
> pointers. Usually these "pointers" are directory entries, but they can
> also be open file descriptors (which are actually visible as pseudofiles
> in the /proc filesystem).

Is this behaviour the same on every filesystem (ext2, FAT, ...)?

What I don't understand is what exactly happens under the hood.
Consider the following sequence:

- process W (writer) writes version A to status.tmp
- process W renames status.tmp to status.json
- process W writes version B to status.tmp
- process R (reader) opens status.json (version A)
- process W renames status.tmp to status.json
  [From now on, every new open of status.json gets the new version]
  [process W could write and rename status.tmp/status.json 1000 times]
- after one hour (say), process R starts reading from the file

From what I understand, process R will get the full contents of version
A (even if it seeks back and rereads the file many times). The OS keeps
the version A data alive because this "ghost file" [1] is still in use.

Most probably, if the file is small, the OS copies its contents into a
cache in RAM when process R opens the file, so process R reads from RAM,
and this would explain why it gets the original version A content.

Anyway, in general the file could be any size, maybe 1 GB. So I assume
that at least part of the version A data still remains on the HDD, even
after process W writes and renames a new version.

Until process R closes the file, the version A data is physically on the
HDD, consuming part of its space. Is that correct?

[1] Ghost because it can't be read by any other process.
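
To check my understanding, the sequence above can be reproduced in a
single Python script. This is only a sketch, assuming Linux (or another
POSIX system) where os.replace() maps to rename(); the helper names
publish() and run_sequence() are made up for the test, and both "W" and
"R" live in the same process here.

```python
import os
import tempfile


def publish(directory, data):
    """Writer side: write to status.tmp, then rename it over status.json."""
    tmp_path = os.path.join(directory, "status.tmp")
    final_path = os.path.join(directory, "status.json")
    with open(tmp_path, "w") as f:
        f.write(data)
    # os.replace() performs the rename; assumed atomic on Linux.
    os.replace(tmp_path, final_path)
    return final_path


def run_sequence():
    with tempfile.TemporaryDirectory() as d:
        path = publish(d, "version A")
        reader = open(path)              # R opens status.json (version A)
        for i in range(1000):            # W keeps publishing new versions
            publish(d, f"version {i}")
        publish(d, "version Z")          # latest version on the filesystem
        first_read = reader.read()       # R finally reads, long after the open
        reader.seek(0)                   # R changes file position and rereads
        second_read = reader.read()
        reader.close()
        with open(path) as f:            # a fresh open follows the current name
            fresh_read = f.read()
        return first_read, second_read, fresh_read


if __name__ == "__main__":
    print(run_sequence())
```

If my understanding is right, the first two values should both be the
old content and the third the latest one, no matter how many renames
happened in between.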
> So when you open the "status.json" file, you get that file, and it stays
> in existence at least until the file is closed. The new "status.tmp" is
> a different file. The rename just makes a new pointer to the new file,
> and erases the old pointer to the old file.
>
>> Consider that the reader could keep open the old status.json for a
>> long time. Does the OS guarantee that old data (maybe 1GB) can be read
>> even if a new file with new data is available?
>>
>
> Yes, as long as you hold the file descriptor open.
>
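
One way to look at the "file without a name" from the reader side is
os.fstat() on the open descriptor. The sketch below is an assumption-laden
test for Linux only: inspect_replaced_file() is a hypothetical helper,
and I expect (but have not verified on every filesystem) that the link
count reported for the replaced file drops to 0 while its size and data
stay available.

```python
import os
import tempfile


def inspect_replaced_file(payload):
    """Open status.json, replace it by rename, then inspect the old file."""
    with tempfile.TemporaryDirectory() as d:
        final_path = os.path.join(d, "status.json")
        tmp_path = os.path.join(d, "status.tmp")

        with open(final_path, "wb") as f:
            f.write(payload)             # the "old" version

        with open(final_path, "rb") as reader:
            with open(tmp_path, "wb") as f:
                f.write(b"new")          # the "new" version
            os.replace(tmp_path, final_path)

            # fstat() works on the descriptor, not on the name, so it
            # still describes the old file that the reader holds open.
            info = os.fstat(reader.fileno())
            data = reader.read()

        return info.st_nlink, info.st_size, data


if __name__ == "__main__":
    nlink, size, data = inspect_replaced_file(b"old" * 1000)
    print("links:", nlink, "size:", size, "bytes read:", len(data))
```

The size reported by fstat() is the size of the old file, which matches
the idea that its blocks stay allocated until the last descriptor is
closed.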