Generic Lexer Missing for rc/conf/config Files
-
Random Void User
- Posts: 104
- Joined: 13.07.2024 21:19
Generic Lexer Missing for rc/conf/config Files
Cud lacks a generic lexer for "rc" and "conf" files. These are common across *nix and macOS.
The file name matches "app.rc" or "apprc" or "app.conf" or just "config" in nearly all cases. I often find myself using "Apache config" lexer as the closest match. I would rather use a generic rc/conf lexer.
Typical rc/conf files have no stanzas like INI sections; they are just line-based.
Sometimes the file mirrors command line switches. An example is miller where the initial "--" on each line is optional. Again, no generic lexer exists for this simple format. I used "Bazelrc" lexer.
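To make the "line-based, optional leading --" idea concrete, here is a minimal Python sketch of how such a line could be split into a key and a value. The function name `parse_rc_line` is hypothetical, and the rules (hash comments, an optional "=" separator) are assumptions for illustration, not the behavior of miller or any specific tool.

```python
import re

# Assumed shape of one line: a key, then an optional "=", then the rest as the value.
_LINE = re.compile(r"([^\s=]+)\s*=?\s*(.*)")

def parse_rc_line(line):
    """Return (key, value) for one rc line, or None for blanks and comments."""
    text = line.strip()
    if not text or text.startswith("#"):   # assumption: "#" starts a comment line
        return None
    if text.startswith("--"):              # the leading "--" is optional
        text = text[2:]
    match = _LINE.match(text)
    if match is None:                      # e.g. a bare "--" or a line starting with "="
        return None
    key, value = match.groups()
    return key, value.strip()
```

With these rules, "--icsv" and "icsv" parse to the same pair, which is the reason one generic lexer could cover both spellings.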
The "most" pager has a typical, simple rc config, with only minor, extra tweaks.
The ingo library for Go language parses a typical rc format.
I could go on with many other examples. Thanks!
-
main Alexey
- Posts: 2931
- Joined: 25.08.2021 18:15
Re: Generic Lexer Missing for rc/conf/config Files
I cannot access linux.die.net; it is blocked for me. Can you post an example rc file here that shows all the syntax elements?
Is the 'Bash script' lexer not OK here?
Is the 'Properties' lexer not OK here?
-
Random Void User
- Posts: 104
- Joined: 13.07.2024 21:19
Re: Generic Lexer Missing for rc/conf/config Files
Perhaps, but it's frustrating that Cud does not auto-lex them. I need detection in user.json or manual commands. I don't see the same kind of problem in nano or micro. They lex these files automatically.
True, several Cud lexers have overlapping features. Maybe "Properties" lexer is close. I would rename it "conf/rc/properties" if that's the intent. In the English world, we speak of configuration files, sometimes rc files, but almost never "properties" files.
However I find re-purposed lexers unsatisfying. The typical rc/conf format easily deserves its own. Re-purposing takes knowledge of both languages to implement. The average user (non-developer) says, "This is a conf file, not a bash script," a logical response.
The limited "detect" syntax in user.json may not work for "apprc" files lacking a dot suffix. The match is "*rc" not "*.rc" regex.
Trying various lexers sometimes fails with values having hyphens:
Code: Select all
...
settingone = whatever
settingtwo = my-special-value-here
settingthree = a_value_with_underscores
...
Lexers treat the hyphenated value (two) differently than non-hyphenated values (one, three). All should color the same in a conf.
The kitty.conf format has variety. The "action" names have hyphens, other things have underscores. One difference from other conf/rc formats is a lack of equal signs.
I wonder if Cud might adopt magic comments like Vim? Then the file could specify its own lexer, not relying on limited user.json detection regex matching.
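For readers unfamiliar with the Vim idea, a rough Python sketch of modeline-style detection follows. It is purely illustrative: the function name `lexer_from_modeline` is hypothetical, the regex covers only a simplified subset of Vim's modeline syntax, and nothing here describes how Cud actually detects lexers.

```python
import re

# Simplified subset of Vim modelines, e.g. "# vim: set ft=conf:" or "# vim: ft=conf".
MODELINE = re.compile(
    r"(?:^|\s)(?:vi|vim|ex):\s*(?:set?\s+)?.*?\b(?:ft|filetype)=([\w.-]+)"
)

def lexer_from_modeline(text, window=5):
    """Return the file type named in a modeline, or None if none is found.

    Only the first and last `window` lines are scanned, mirroring Vim's
    default of checking five lines at each end of the file.
    """
    lines = text.splitlines()
    for line in lines[:window] + lines[-window:]:
        match = MODELINE.search(line)
        if match:
            return match.group(1)
    return None
```

A file whose first line is "# vim: set ft=conf:" would then name its own lexer, regardless of whether its file name matches any detection pattern.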
-
Random Void User
- Posts: 104
- Joined: 13.07.2024 21:19
Re: Generic Lexer Missing for rc/conf/config Files
Code: Select all
% This is an example configuration file that provides a `less' emulation
% for MOST.
%
% The format is simple:
% setkey <function-name> <key-name>
% unsetkey <key-name>
% color <object-name> <fg> <bg>
%
% Here <key-name> must be enclosed in double quotes. '^' represents Ctrl.
% The function name is not in double quotes.
%
%
% When MOST starts up, it looks for a keymap file given by the environment
% variable MOST_INITFILE. If that environment variable does not exist,
% MOST will look in the users HOME directory for .mostrc on Unix systems
% and MOST.RC on VMS systems.
%
% /usr/share/doc/most/most.rc lists default MOST keybindings.
%
%
% The following functions may be used in custom keymaps:
%
% edit
% suspend
% toggle_lock
% next_file
% toggle_options
% extended_key
% toggle_case
% delete_file
% bob
% delete_window
% digit_arg
% down
% edt_backward
% edt_find
% edt_find_next
% edt_forward
% edt_line
% edt_page
% eob
% exit
% find_file
% find_next
% goto_line
% goto_mark
% goto_percent
% help
% one_window
% other_window
% page_down
% page_left
% page_right
% column_right
% column_left
% page_up
% redraw
% search_backward
% search_forward
% set_mark
% show_time
% tail_mode
% toggle_width
% two_windows
% up
% Color settings
% color normal lightgray black
color normal black white
color status yellow black
color underline brightgreen black
color overstrike brightred black
% Keybindings
unsetkey "^K"
setkey up "^K"
unsetkey ":"
setkey next_file ":]"
setkey find_file ":e"
setkey toggle_options ":o"
setkey toggle_case ":c"
setkey delete_file ":x"
setkey exit ":q"
setkey down "e"
setkey down "E"
setkey down "^N"
setkey up "u"
setkey up "^Y"
setkey up "^P"
setkey page_down "f"
setkey page_down "^F"
setkey page_up "b"
setkey page_up "^B"
setkey other_window "z"
setkey other_window "w"
setkey search_forward "/"
setkey search_backward "?"
setkey bob "p"
setkey goto_mark "'"
setkey find_file "E"
setkey edit "v"
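The format described in the comments above (a directive word, then arguments, with key names in double quotes and % starting a comment) can be tokenized with Python's standard shlex module. This is only a sketch; the function name `parse_most_line` is hypothetical, and real MOST parsing may differ in details such as escapes.

```python
import shlex

def parse_most_line(line):
    """Split one most.rc line into (directive, [arguments]).

    Returns None for blank lines and lines that are entirely a % comment.
    """
    lexer = shlex.shlex(line, posix=True)
    lexer.commenters = "%"          # most.rc uses % rather than # for comments
    lexer.whitespace_split = True   # split on whitespace only; keep ":" and "^" in tokens
    tokens = list(lexer)
    if not tokens:
        return None
    return tokens[0], tokens[1:]
```

For example, `parse_most_line('setkey up "^K"')` yields the directive "setkey" with the arguments "up" and "^K", the quotes having been removed by shlex.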
-
Random Void User
- Posts: 104
- Joined: 13.07.2024 21:19
Re: Generic Lexer Missing for rc/conf/config Files
Note conf/rc filenames may start with a literal dot. So the possibilities include
Code: Select all
app.rc   | .app.rc
apprc    | .apprc
app.conf | .app.conf
config   | .config
         | .app
These are typical patterns in *nix and macOS. The .app pattern is amazingly common. So is vanilla "config" distinguished only by folder location in $XDG_CONFIG_HOME or a project folder of some sort.
-
main Alexey
- Posts: 2931
- Joined: 25.08.2021 18:15
Re: Generic Lexer Missing for rc/conf/config Files
I can make the new conf/rc lexer (and put it in Addons Manager). Let me know the lexer rules:
1- Do comments start with the % char, not the # char as in Bash?
2- Must the % char be the first char, or the first non-space char in a line, or can a comment follow something useful on the same line?
3- Are strings double-quoted, not single-quoted?
4- What is the escape method for a quote inside a quoted string? Backslash, as in C++?
5- Do numbers need highlighting or not?
5b- Integers and floating-point numbers? What format do hex/octal numbers have?
>I wonder if Cud might adopt magic comments like Vim?
This info is written in the wiki, part 'Encoding detection'.
wiki site has anti-AI problems, so instead of going to the wiki site, read it here
https://github.com/Alexey-T/CudaText/bl ... atext.wiki
Also, encoding detect can be supported by plugin "File Type Profile".
-
Random Void User
- Posts: 104
- Joined: 13.07.2024 21:19
Re: Generic Lexer Missing for rc/conf/config Files
An app called SimpleX.chat uses "settings.properties" so that's yet another file name pattern. It's rare on *nix. Note readline uses /etc/inputrc and ~/.inputrc both, which brings it close to the appname/.appname pattern, without actually using the app name as such.
The hash "#" is far more common than the percent sign "%" for comment lines. Yes it typically must be the first char on the line. I would just assume comments can also terminate lines containing data.
The most.rc format is not really important to me, as I use other pagers (moor, bat, less), but I wanted to close the loop on most. And its example shows that probably, no single, generic lexer can handle all conf/rc file types. Covering all variations might take two or three generic lexers. Maybe "conf-rc-hash-comments" and "conf-rc-percent-comments" could work as lexer names. Some confs also use double slash (//) to imitate C language, so that lexer could be "conf-rc-C-comments" for example.
I suggest opening various conf/rc files in micro, nano, and neovim with syntax highlighting on. Get some ideas that way. Nano needs work to install a full suite of syntax files. Neovim needs config work to get there too, but you can find pre-made setups like LunarVim.
I like raw booleans (t, f, T, F, true, false, True, False, TRUE, FALSE, 1, 0) and numbers (0, 1, 2, 3.14159, 4.1234e12) to be colored. If quoted, then they become strings for lexing purposes, like any other string. I don't know about escaping special characters. Not many conf/rc formats support such strings, but some that define regular expressions do.
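As a sketch of the coloring rules just described, the following Python function classifies a single value token. The name `classify_value` and the category labels are hypothetical. Treating bare 1 and 0 as booleans rather than numbers is an assumption taken from the list above, and a real lexer might choose the opposite.

```python
import re

# Boolean spellings exactly as listed in the post (other casings are not accepted).
BOOLEANS = {"t", "f", "T", "F", "true", "false", "True", "False",
            "TRUE", "FALSE", "1", "0"}

# Integers, decimals, and exponent forms such as 4.1234e12.
NUMBER = re.compile(r"[+-]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?")

def classify_value(token):
    """Return 'string', 'boolean', 'number', or 'plain' for one value token."""
    if len(token) >= 2 and token[0] == token[-1] and token[0] in "\"'":
        return "string"     # quoted values are strings, whatever they contain
    if token in BOOLEANS:
        return "boolean"    # assumption: checked before numbers, so 1 and 0 land here
    if NUMBER.fullmatch(token):
        return "number"
    return "plain"          # hyphenated and underscored values are treated alike
```

Under these rules "my-special-value-here" and "a_value_with_underscores" both come out as "plain", so they would be colored the same.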
Thanks!
-
main Alexey
- Posts: 2931
- Joined: 25.08.2021 18:15
Re: Generic Lexer Missing for rc/conf/config Files
I posted a link to the finished lexer, then removed it. I want to remake it as a lite lexer.
-
main Alexey
- Posts: 2931
- Joined: 25.08.2021 18:15
Re: Generic Lexer Missing for rc/conf/config Files
Lite lexer is ready, I made it within 10 mins.
https://github.com/CudaText-addons/Cuda ... /Config-rc
Copy the *.cuda-litelexer file to 'data/lexliblite'.
Please test.
You can easily change it yourself: comment kinds, special words, special symbols, etc.
-
Random Void User
- Posts: 104
- Joined: 13.07.2024 21:19
Re: Generic Lexer Missing for rc/conf/config Files
Done, installed by Addons Manager for testing (I made a .zip).
Many "rc" files look like ".manglrc" or "shellcheckrc" or the name can be "conf" alone. So change
Code: Select all
"files": "*.rc;*.conf;config;.config"
into
Code: Select all
"files": "*rc;conf;*.conf;config;*.config"
I assume asterisk means "zero or more characters" so "*.config" matches ".config" by itself, and "*.conf" matches ".conf" by itself as the zero-length corner-cases.
Note ".bashrc" matches "*rc" but needs the bash lexer instead. I don't know how Cud might prioritize it.
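One way the priority question could be handled is "specific rules first, generic patterns last". The Python sketch below illustrates that idea with the standard fnmatch module. The `RULES` table and the `pick_lexer` function are hypothetical, the Bash entries are examples only, and Python's glob semantics may not match what Cud does with the "files" field.

```python
from fnmatch import fnmatchcase

# Hypothetical rule table: earlier entries win, so exact names for
# shell start-up files are listed before the generic rc/conf patterns.
RULES = [
    ("Bash script", [".bashrc", ".bash_profile"]),
    ("Config-rc", ["*rc", "conf", "*.conf", "config", "*.config"]),
]

def pick_lexer(filename):
    """Return the first lexer whose patterns match the bare file name, else None."""
    for lexer, patterns in RULES:
        if any(fnmatchcase(filename, pattern) for pattern in patterns):
            return lexer
    return None
```

In Python's fnmatch the asterisk does match zero characters, and a leading dot is not special, so "*.conf" matches ".conf" and "*rc" matches ".manglrc" here. Whether Cud treats the asterisk the same way would need testing.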