Here's the new lexer I created.
Description:
https://github.com/dimon40001/cudatext- ... /README.md
Download from GitHub:
https://github.com/dimon40001/cudatext- ... x-protocol
Lexer supports Code Tree and allows easy search for specific FIX tags.
Feedback is welcome as usual =)
Suggestions / pending TODO items
* make the 8=... tag (the FIX message start) the start of the range, but keep the Name format for the Code Tree taking everything from the start of the line (this will require extracting 8= as a separate token, distinct from the Header-Key token)
* highlight timestamps (which format? `yyyy-MM-dd` and `hh:mm:ss.sss`?)
* highlight full lines containing 35=0 or 35=1 in the "Comments" style to make them less visible, since these heartbeat (35=0) and test request (35=1) messages usually appear many times in the log
* tokenize optional header tags as well; currently only the mandatory header tags are highlighted
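To make the first and third TODO items concrete, here is a minimal Python sketch of the same logic outside the lexer. It is not how the lexer is implemented; the function names, the `|` delimiter, and the sample log line are assumptions made up for illustration (real FIX uses the SOH character, which many log viewers print as `|`).

```python
SOH = "\x01"  # standard FIX field delimiter; pass it as delimiter for raw logs


def split_log_line(line, delimiter="|"):
    """Split a log line into (prefix, fields).

    prefix is everything before the first "8=FIX" (BeginString) tag,
    fields is a list of (tag, value) string pairs of the FIX message.
    Returns (line, []) when the line holds no FIX message.
    """
    start = line.find("8=FIX")
    if start < 0:
        return line, []
    prefix = line[:start].rstrip()
    fields = []
    for part in line[start:].rstrip("\r\n").split(delimiter):
        if "=" in part:  # skips the empty piece after a trailing delimiter
            tag, value = part.split("=", 1)
            fields.append((tag, value))
    return prefix, fields


def is_session_noise(fields):
    """True for Heartbeat (35=0) and TestRequest (35=1) messages."""
    return any(tag == "35" and value in ("0", "1") for tag, value in fields)


# Invented sample line; body length and checksum are not real values.
sample = "2022-08-05 11:07:00.123 OUT 8=FIX.4.4|9=55|35=0|49=CLIENT|56=SERVER|34=12|10=128|"
prefix, fields = split_log_line(sample)
print(prefix)                    # 2022-08-05 11:07:00.123 OUT
print(is_session_noise(fields))  # True
```

The prefix plays the role of the Code Tree node name, and `is_session_noise` marks the lines that would get the "Comments" style.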
lexer FIX Message (Financial Information eXchange)
Last edited by dimon40001 on 05.08.2022 11:07, edited 4 times in total.
I added a section in my original post to gather all ideas and proposals so that we keep track of them and after some time I will make a second version.
I changed jobs recently and started working with FIX, so over time it will become clearer to me what is practical and useful when working with FIX: what to highlight, where the attention goes during debugging, etc.
main Alexey wrote: Thanks!
Empty items force Cud's dialog "Lexer style mapping" to show on choosing the lexer.
What is the difference between the "Assign None" button (for instance `Separator --- -`) and the "Unassign" button (displayed as `Separator --- ?` in the dialog window)?
I think I should put the "dash" in the .cuda-lexmap file to set the style mapping to None, so that the color predefined in the lexer will be used. Is this correct?
main Alexey wrote: - maybe highlight date/time stamps?
The format could differ, since every application has its own date-time format.
Also, my understanding is that in FIX message log analysis the chronological sequence of messages matters more than the timestamp (since there is an intensive flow of messages from client to server and back).
My other concern is that once we tokenize the date, we become closely bound to the log message format, which could differ between users.
Currently I use the whole text from the start of the line until "8=..." as a group node in the Code Tree. If we tokenize the datetime, would it be possible to keep such logic? The documentation on special formatting like %s0... was a bit cryptic for me.
The question is whether this is possible:
no-token token0 notoken text token1 ... even more text and tokens
If I define token1 to start the range, then can I define a Name format as "everything before token1 from the start of the line" to be displayed in the Code Tree?
The last consideration is whether this is practical at all. Usually the timestamp is at a fixed position in the log file, so we would end up with a whole wide column (say, in blue) acting more like a distraction for the user, because the timestamp appears in every line.
main Alexey wrote: - maybe highlight {texthere}
What is this? I don't understand the {texthere} placeholder. Where is it coming from?
Regarding the "Comments" style: it's currently not supported because there is no commonly recognized "comment" in a typical text log file.
I could add the "Comment" style and highlight whole lines containing heartbeat FIX messages (those with the 35=0 or 35=1 tag; there are usually many of them, which is not practical).
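To illustrate the concern about being bound to one log format, here is a rough Python sketch that looks for timestamps only in the text before "8=FIX". The two patterns are an assumption taken from the formats named in the TODO list (`yyyy-MM-dd` and `hh:mm:ss.sss`); the function name and sample lines are made up.

```python
import re

# Assumed formats from the TODO list: yyyy-MM-dd and hh:mm:ss with optional .sss
DATE_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")
TIME_RE = re.compile(r"\b\d{2}:\d{2}:\d{2}(?:\.\d{3})?\b")


def find_timestamps(line):
    """Return sorted (start, end, text) spans found before the FIX message.

    Only the log prefix is searched, so timestamps inside FIX tags
    (for example tag 52, SendingTime) are left alone.
    """
    cut = line.find("8=FIX")
    head = line if cut < 0 else line[:cut]
    spans = []
    for rx in (DATE_RE, TIME_RE):
        for m in rx.finditer(head):
            spans.append((m.start(), m.end(), m.group()))
    return sorted(spans)


line = "2022-08-05 11:07:00.123 OUT 8=FIX.4.4|35=0|52=20220805-11:07:00.123|"
print(find_timestamps(line))
# [(0, 10, '2022-08-05'), (11, 23, '11:07:00.123')]

# A log written with another date-time format is simply not recognized:
print(find_timestamps("05.08.2022 11:07 OUT 8=FIX.4.4|35=0|"))
# []
```

The second call shows the downside: a log using a different date-time layout gets no highlighting at all unless more patterns are added.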
dimon40001 wrote: What is the difference between "Assign None" (for instance `Separator --- -`) and "Unassign" buttons (displayed as `Separator --- ?` in the dialog window)?
'Assign None' sets the style to the 'dash' char, so the color will be the 'usual text color'.
'Unassign' removes all content, so Cud will show the "Lexer style mapping" dialog. Empty values are not OK for a release.
dimon40001 wrote: put the "dash" in the .cuda-lexmap file to set the Style mapping to None, so that the color predefined in the lexer will be used. Is this correct?
Not the 'color predefined in the lexer' but the 'usual text color (w/o lexer)',
so the same 'usual color' is used for all items with the dash char.
dimon40001 wrote: If I define token1 to start the range, then can I have a Name format to be defined as "everything before token1 from the start of the line" to be displayed in the Code Tree?
%sz0 means 'from token 0 to the end of line'.
%sl0 means 'from token 0 to the beginning of line'.
%s0 means 'token 0'.
%s-1 means 'previous token'.
dimon40001 wrote: I don't understand the {texthere} placeholder. Where is it coming from?
Your sample file has some values in { } brackets, no?
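The four Name-format macros listed above can be modelled in a few lines of Python. This is only a toy model for reasoning about them, not CudaText's implementation: the `expand` function, the token representation as (start, end) offsets, and the choice that %sl0 includes token 0 itself are all assumptions (the thread does not say whether the token is included).

```python
import re

MACRO_RE = re.compile(r"%s([zl]?)(-?\d+)$")


def expand(macro, text, tokens, base):
    """Expand one macro against a token list (hypothetical model).

    tokens: list of (start, end) offsets in text, sorted by position
    base:   index of the token that starts the range ("token 0")
    """
    m = MACRO_RE.match(macro)
    if m is None:
        raise ValueError(f"unsupported macro: {macro!r}")
    mode, offset = m.group(1), int(m.group(2))
    index = base + offset
    if not 0 <= index < len(tokens):
        return ""
    start, end = tokens[index]
    if mode == "z":  # from the token to the end of its line
        line_end = text.find("\n", start)
        return text[start:len(text) if line_end < 0 else line_end]
    if mode == "l":  # from the beginning of the line to the token (inclusive, assumed)
        line_start = text.rfind("\n", 0, start) + 1
        return text[line_start:end]
    return text[start:end]  # the token itself


text = "12:00:01 OUT 8=FIX.4.4|35=D|"
tokens = [(13, 22), (23, 27)]  # "8=FIX.4.4" and "35=D"

print(expand("%s0", text, tokens, 0))   # 8=FIX.4.4
print(expand("%sz0", text, tokens, 0))  # 8=FIX.4.4|35=D|
print(expand("%sl0", text, tokens, 0))  # 12:00:01 OUT 8=FIX.4.4
print(expand("%s-1", text, tokens, 1))  # 8=FIX.4.4
```

Under this model, %sl0 with the 8=... token as token 0 gives the line prefix plus the token, which is close to "everything from the start of the line" for the Code Tree name.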