lexer FIX Message (Financial Information eXchange)

dimon40001 · Post by **dimon40001** » 04.08.2022 20:31

Here's the new lexer I created.

Description:

https://github.com/dimon40001/cudatext- ... /README.md

Download from GitHub:

https://github.com/dimon40001/cudatext- ... x-protocol

Lexer supports Code Tree and allows easy search for specific FIX tags.

Feedback is welcome as usual =)

Suggestions / pending TODO items
* make the `8=...` FIX start the beginning of the range, but keep the Name format for the Code Tree taking everything from the start of the line (this will require extracting `8=` as a separate token, distinct from the Header-Key token)
* highlight timestamps (which format? `yyyy-MM-dd` and `hh:mm:ss.sss`?)
* highlight full lines containing 35=0 (Heartbeat) or 35=1 (Test Request) in the "Comments" style to make them less visible, as these are routine keep-alive messages that usually appear many times in the log
* add tokenization of optional header tags; currently only mandatory header tags are highlighted

main Alexey · Post by **main Alexey** » 05.08.2022 07:38

Thanks!
In the .cuda-lexmap file, these items are empty:
[map]
Separator=
Log-message=

Empty items force Cud's dialog "Lexer style mapping" to show on choosing the lexer.
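
A quick way to catch such empty items before a release could look like the sketch below. It only assumes that the .cuda-lexmap file is INI-like with a `[map]` section, as in the excerpt above; the function name `empty_lexmap_items` and the `Header-Key=Id` sample entry are made up for illustration.

```python
import configparser


def empty_lexmap_items(text):
    """Return the names in the [map] section whose value is empty."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep the original case of item names
    parser.read_string(text)
    if not parser.has_section("map"):
        return []
    return [name for name, value in parser.items("map") if not value.strip()]


# Sample content: the first two items come from the post above,
# the third one is a hypothetical filled-in mapping.
sample = """[map]
Separator=
Log-message=
Header-Key=Id
"""

print(empty_lexmap_items(sample))
```

Any name reported here would need a value (a style name, or the dash char for "none") before the lexer is published.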

main Alexey · Post by **main Alexey** » 05.08.2022 07:40

As I see in the picture on GitHub, the lexer allows #-comments? Then why are these comments not highlighted? Why is the line-comment char not configured?

Small ideas:
- maybe highlight date/time stamps?
they are not highlighted on the sample picture on the top.
- maybe highlight {texthere} ?

dimon40001 · Post by **dimon40001** » 05.08.2022 10:45

I added a section in my original post to gather all ideas and proposals so that we keep track of them and after some time I will make a second version.
I changed my job recently and started working with FIX so after some time it will be more clear to me what is practical and what is useful in terms of working with FIX, what to highlight, where the attention goes during the debugging, etc.

main Alexey wrote:Thanks!
Empty items force Cud's dialog "Lexer style mapping" to show on choosing the lexer.

What is the difference between "Assign None" (for instance `Separator --- -`) and "Unassign" buttons (displayed as `Separator --- ?` in the dialog window)?
I think I should put the "dash" in the .cuda-lexmap file to set the Style mapping to None, so that the color predefined in the lexer will be used. Is this correct?

main Alexey wrote: - maybe highlight date/time stamps?

The format could differ, since every application has its own date-time format.
Also my understanding was that in the FIX messages log analysis the chronological sequence of messages is more important than the timestamp (since there is an intensive flow of messages from client to server and back).
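
To make the format question concrete, here is a rough sketch of matching the two formats listed in the TODO (`yyyy-MM-dd` and `hh:mm:ss.sss`). This is plain Python `re`, not lexer syntax. The pattern table is an assumption that each user would have to adjust to their own log layout, and the sample log line is invented.

```python
import re

# Assumed formats, taken from the TODO item; real logs may use others.
TIMESTAMP_PATTERNS = {
    "date": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),          # yyyy-MM-dd
    "time": re.compile(r"\b\d{2}:\d{2}:\d{2}\.\d{3}\b"),   # hh:mm:ss.sss
}


def find_timestamps(line):
    """Return (kind, text) pairs for every date/time fragment in the line."""
    found = []
    for kind, pattern in TIMESTAMP_PATTERNS.items():
        for match in pattern.finditer(line):
            found.append((kind, match.group()))
    return found


# Hypothetical log line: timestamp prefix followed by a FIX message.
line = "2022-08-05 10:45:00.123 OUT 8=FIX.4.4|35=0|"
print(find_timestamps(line))
```

Every extra log layout means another entry in the table, which is exactly the coupling to the log format mentioned above.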

My other concern is that once we tokenize the date we become very closely bound to the log message format which could differ for different users.
Currently I use the whole text from the start of the line until "8=..." as a group node in the Code Tree. If we tokenize the datetime, would it still be possible to keep this logic? The documentation on using special formatting like %s0... was a bit cryptic to me.

The question is if this is possible:
no-token token0 notoken text token1 ... even more text and tokens

If I define token1 to start the range, then can I have a Name format defined as "everything before token1 from the start of the line" to be displayed in the Code Tree?
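
To illustrate the intended result outside of the lexer, here is a small Python sketch of the split I have in mind: everything before the FIX start becomes the Code Tree name, the rest is the message. It assumes the message begins at the first `8=FIX` on the line; `split_log_line` and the sample line are made up for illustration and say nothing about how the lexer engine itself works.

```python
import re

# Assumption: the FIX message starts at the first "8=FIX" that is not
# part of a longer tag number (e.g. not the tail of "108=FIX").
FIX_START = re.compile(r"(?<!\d)8=FIX")


def split_log_line(line):
    """Split a log line into (prefix, fix_message).

    The prefix is everything from the start of the line up to the FIX
    start; if no FIX start is found, the whole line is the prefix.
    """
    match = FIX_START.search(line)
    if match is None:
        return line, ""
    return line[:match.start()], line[match.start():]


# Hypothetical log line with "|" standing in for the SOH delimiter.
prefix, message = split_log_line("2022-08-05 10:45:00.123 OUT 8=FIX.4.4|9=5|35=0|10=000|")
print(repr(prefix))
print(message)
```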

The last consideration is whether this is practical at all. The timestamp is usually at a fixed position in the log file, so we would end up with a whole wide column (let's say in blue) acting more like a distraction for the user, because the timestamp appears in every line.

main Alexey wrote: - maybe highlight {texthere}

What is this? I don't understand the {texthere} placeholder. Where is it coming from?

Regarding the "Comments" style: it's currently not supported because there is no commonly recognized "comment" in a typical text log file.
I could add the "Comment" style and highlight whole lines containing heartbeat FIX messages (those with the 35=0 or 35=1 tag; there are usually many of them, which clutters the log).
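
As a rough sketch of which lines would be affected, the check could look like this in Python. It assumes fields are separated by SOH (`\x01`), `|` or a literal `^A`, which are just the delimiters I would expect in logs; `is_heartbeat_line` is a made-up helper, not part of the lexer.

```python
import re

# Tag 35 must be a whole field: preceded by a delimiter (or line start)
# and followed by a delimiter (or line end), so "135=0" and "35=10"
# are not matched. The delimiter set is an assumption.
HEARTBEAT = re.compile(r"(?:^|\x01|\||\^A)35=[01](?:\x01|\||\^A|$)")


def is_heartbeat_line(line):
    """Return True if the line carries a 35=0 or 35=1 field."""
    return HEARTBEAT.search(line) is not None


# Hypothetical sample lines.
print(is_heartbeat_line("10:45:00.123 8=FIX.4.4|9=5|35=0|10=000|"))
print(is_heartbeat_line("10:45:01.456 8=FIX.4.4|9=9|35=D|55=ABC|10=000|"))
```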

main Alexey · Post by **main Alexey** » 05.08.2022 11:48

What is the difference between "Assign None" (for instance `Separator --- -`) and "Unassign" buttons (displayed as `Separator --- ?` in the dialog window)?

'Assign None' sets the 'dash' char as the style, so the color will be the 'usual text color'.
'Unassign' removes all content, so Cud will show the "Lexer style mapping" dialog. Empty values are not OK for a release.

put the "dash" in the .cuda-lexmap file to set the Style mapping to None, so that the color predefined in the lexer will be used. Is this correct?

not 'color predefined in the lexer' but 'usual text color (w/o lexer)'.
so the same 'usual color' is used for all items with dash-char.

If I define token1 to start the range, then can I have a Name format defined as "everything before token1 from the start of the line" to be displayed in the Code Tree?

%sz0 means 'from token 0 to the end of line'.
%sl0 means 'from token 0 to the beginning of line'.
%s0 means 'token 0'.
%s-1 means 'previous token'.

I don't understand the {texthere} placeholder. Where is it coming from?

your sample file has some values in { } brackets, no?