Generic Lexer Missing for rc/conf/config Files

Random Void User
Posts: 104
Joined: 13.07.2024 21:19

Post by Random Void User »

CudaText (Cud) lacks a generic lexer for "rc" and "conf" files. These files are common across *nix and macOS.

The file name matches "app.rc" or "apprc" or "app.conf" or just "config" in nearly all cases. I often find myself using "Apache config" lexer as the closest match. I would rather use a generic rc/conf lexer.

Typical rc/conf files have no stanzas like INI sections; they are just line-based.

Sometimes the file mirrors command line switches. An example is Miller, where the initial "--" on each line is optional. Again, no generic lexer exists for this simple format. I used the "Bazelrc" lexer.
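A minimal sketch of the "optional leading --" idea, in Python. The function name, the sample lines, and the assumption that "#" starts a comment are illustrative only and are not taken from Miller's documentation.

```python
from typing import Optional


def to_switch(line: str) -> Optional[str]:
    """Turn one rc line into a command-line style switch.

    Returns None for blank lines and for lines assumed to be comments.
    """
    text = line.strip()
    if not text or text.startswith("#"):  # "#" comments are an assumption
        return None
    # The leading "--" is optional, so add it only when it is missing.
    return text if text.startswith("--") else "--" + text


# Hypothetical rc content: both spellings should yield the same kind of switch.
sample = ["# output as JSON", "--ojson", "icsv", ""]
switches = [s for s in map(to_switch, sample) if s is not None]
print(switches)
```

A lexer for this format would face the same choice: a bare word at the start of a line and a word prefixed with "--" are the same kind of token.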

The "most" pager has a typical, simple rc config, with only minor, extra tweaks.

The ingo library for the Go language parses a typical rc format.

I could go on with many other examples. Thanks!
main Alexey
Posts: 2931
Joined: 25.08.2021 18:15

Re: Generic Lexer Missing for rc/conf/config Files

Post by main Alexey »

I cannot access linux.die.net; it is blocked. Can you post an example rc file here that shows all the syntax features?

Is the 'Bash script' lexer not OK here?
Is the 'Properties' lexer not OK here?
Random Void User
Posts: 104
Joined: 13.07.2024 21:19

Re: Generic Lexer Missing for rc/conf/config Files

Post by Random Void User »

Perhaps, but it's frustrating that Cud does not auto-lex them. I need detection in user.json or manual commands. I don't see the same kind of problem in nano or micro. They lex these files automatically.

True, several Cud lexers have overlapping features. Maybe "Properties" lexer is close. I would rename it "conf/rc/properties" if that's the intent. In the English world, we speak of configuration files, sometimes rc files, but almost never "properties" files.

However, I find re-purposed lexers unsatisfying. The typical rc/conf format easily deserves its own. Re-purposing takes knowledge of both languages to implement. The average user (non-developer) says, "This is a conf file, not a bash script," which is a logical response.

The limited "detect" syntax in user.json may not work for "apprc" files lacking a dot suffix. The match has to be "*rc", not "*.rc".

Trying various lexers sometimes fails with values having hyphens:

Code:

...
settingone = whatever
settingtwo = my-special-value-here
settingthree = a_value_with_underscores
...
Lexers treat the hyphenated value (settingtwo) differently from the non-hyphenated values (settingone, settingthree). All three should be colored the same in a conf file.
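To illustrate the expectation, here is a small Python sketch that splits a "key = value" line and keeps everything after the equal sign as one value token, hyphens included. The regex and the function name are assumptions for illustration; this is not how any CudaText lexer is implemented.

```python
import re

# Key, optional spaces, "=", then the rest of the line as a single value token.
LINE_RE = re.compile(r"^\s*([A-Za-z_][\w.-]*)\s*=\s*(.*?)\s*$")


def split_setting(line: str):
    """Return (key, value) for a 'key = value' line, or None if it does not match."""
    m = LINE_RE.match(line)
    return (m.group(1), m.group(2)) if m else None


for line in [
    "settingone = whatever",
    "settingtwo = my-special-value-here",
    "settingthree = a_value_with_underscores",
]:
    # Each value comes back whole, so a highlighter could color all three alike.
    print(split_setting(line))
```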

The kitty.conf format has variety. The "action" names have hyphens, other things have underscores. One difference from other conf/rc formats is a lack of equal signs.

I wonder if Cud might adopt magic comments like Vim? Then the file could specify its own lexer, instead of relying on the limited regex detection in user.json.
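For reference, a rough Python sketch of how a Vim-style magic comment (modeline) could be read from a file. The regex is a simplification of Vim's modeline syntax, the function name is made up, and how a filetype name would map to a CudaText lexer is left open. The default of 5 scanned lines mirrors Vim's default 'modelines' value.

```python
import re

# Simplified: finds "ft=NAME" or "filetype=NAME" after a "vi:" or "vim:" marker.
MODELINE_RE = re.compile(r"\bvim?:.*?\b(?:ft|filetype)=([\w.-]+)")


def lexer_hint(text: str, scan: int = 5):
    """Return the filetype named in a modeline near the top or bottom, else None."""
    lines = text.splitlines()
    # Only the first and last few lines are checked, as Vim does.
    for line in lines[:scan] + lines[-scan:]:
        m = MODELINE_RE.search(line)
        if m:
            return m.group(1)
    return None


print(lexer_hint("settingone = whatever\n# vim: set ft=conf:\n"))
print(lexer_hint("settingone = whatever\n"))
```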
Random Void User
Posts: 104
Joined: 13.07.2024 21:19

Re: Generic Lexer Missing for rc/conf/config Files

Post by Random Void User »

Code:

% This is an example configuration file that provides a `less' emulation 
% for MOST.
%
% The format is simple: 
%       setkey <function-name> <key-name>
%       unsetkey <key-name>
%       color <object-name> <fg> <bg>
%
% Here <key-name> must be enclosed in double quotes.  '^' represents Ctrl.
% The function name is not in double quotes.
% 
%
% When MOST starts up, it looks for a keymap file given by the environment 
% variable MOST_INITFILE.  If that environment variable does not exist, 
% MOST will look in the users HOME directory for .mostrc on Unix systems
% and MOST.RC on VMS systems.
%
% /usr/share/doc/most/most.rc lists default MOST keybindings.
%
%
% The following functions may be used in custom keymaps:
% 
% edit
% suspend
% toggle_lock
% next_file
% toggle_options
% extended_key
% toggle_case
% delete_file
% bob
% delete_window
% digit_arg
% down
% edt_backward
% edt_find
% edt_find_next
% edt_forward
% edt_line
% edt_page
% eob
% exit
% find_file
% find_next
% goto_line
% goto_mark
% goto_percent
% help
% one_window
% other_window
% page_down
% page_left
% page_right
% column_right
% column_left
% page_up
% redraw
% search_backward
% search_forward
% set_mark
% show_time
% tail_mode
% toggle_width
% two_windows
% up



% Color settings

% color normal lightgray black
color normal black white
color status yellow black
color underline brightgreen black
color overstrike brightred black

% Keybindings

unsetkey "^K"
setkey up "^K"

unsetkey ":"
setkey next_file ":]"
setkey find_file ":e"
setkey toggle_options ":o"
setkey toggle_case ":c"
setkey delete_file ":x"
setkey exit ":q"

setkey down "e"
setkey down "E"
setkey down "^N"
setkey up "u"
setkey up "^Y"
setkey up "^P"
setkey page_down "f"
setkey page_down "^F"
setkey page_up "b"
setkey page_up "^B"
setkey other_window "z"
setkey other_window "w"
setkey search_forward "/"
setkey search_backward "?"
setkey bob "p"
setkey goto_mark "'"
setkey find_file "E"
setkey edit "v"
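The format above is simple enough to parse in a few lines. A minimal Python sketch, assuming "%" only starts whole-line comments (as in this example file) and that key names follow shell-like double quoting; the function name is hypothetical.

```python
import shlex


def parse_mostrc(text: str):
    """Yield (command, args) for each non-comment line of a most.rc-style file."""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # blank line or whole-line comment
        # shlex strips the double quotes and keeps "'" and "?" as plain characters.
        command, *args = shlex.split(line)
        yield command, args


sample = '''% Color settings
color normal black white
unsetkey "^K"
setkey up "^K"
setkey goto_mark "'"
'''
for command, args in parse_mostrc(sample):
    print(command, args)
```

A lexer needs even less than this: a comment rule, a string rule, and three keywords (setkey, unsetkey, color).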
Random Void User
Posts: 104
Joined: 13.07.2024 21:19

Re: Generic Lexer Missing for rc/conf/config Files

Post by Random Void User »

Note that conf/rc filenames may start with a literal dot. So the possibilities include:

Code:

app.rc   | .app.rc
apprc    | .apprc
app.conf | .app.conf
config   | .config
         | .app
These are typical patterns in *nix and macOS. The .app pattern is amazingly common. So is a plain "config", distinguished only by its folder location in $XDG_CONFIG_HOME or in a project folder of some sort.
main Alexey
Posts: 2931
Joined: 25.08.2021 18:15

Re: Generic Lexer Missing for rc/conf/config Files

Post by main Alexey »

I can make a new conf/rc lexer (and put it in Addons Manager). Let me know the lexer rules:

1- Do comments start with the % char, not the # char as in Bash?
2- Must the % char be the first char in a line, or the first non-space char, or can a comment also follow something useful on the same line?
3- Are strings double-quoted, not single-quoted?
4- What is the escape method for a quote inside a quoted string? Backslash, as in C++?
5- Do numbers need to be highlighted or not?
5b- Integers and floating-point numbers? What format do hex/octal numbers have?


>I wonder if Cud might adopt magic comments like Vim?

This info is in the wiki, in the section 'Encoding detection'.
The wiki site has anti-AI problems, so instead of going to the wiki site, read it here:
https://github.com/Alexey-T/CudaText/bl ... atext.wiki

Also, encoding detection can be supported by the plugin "File Type Profile".
Random Void User
Posts: 104
Joined: 13.07.2024 21:19

Re: Generic Lexer Missing for rc/conf/config Files

Post by Random Void User »

An app called SimpleX.chat uses "settings.properties", so that's yet another file name pattern. It's rare on *nix. Note that readline uses both /etc/inputrc and ~/.inputrc, which brings it close to the appname/.appname pattern, without actually using the app name as such.

The hash "#" is far more common than the percent sign "%" for comment lines. Yes, it typically must be the first char on the line. I would just assume comments can also terminate lines containing data.

The most.rc format is not really important to me, as I use other pagers (moor, bat, less), but I wanted to close the loop on most. Its example also shows that probably no single generic lexer can handle all conf/rc file types. Covering all variations might take two or three generic lexers. Maybe "conf-rc-hash-comments" and "conf-rc-percent-comments" could work as lexer names. Some confs also use a double slash (//) to imitate the C language, so that lexer could be "conf-rc-C-comments", for example.
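The three variants differ only in the comment prefix, which a rough Python sketch can show. The lexer names come from the paragraph above; the function and its rule (a trailing comment needs whitespace before the prefix) are assumptions, and quoting is ignored.

```python
import re

# Proposed lexer names mapped to the comment prefix each one would use.
COMMENT_PREFIXES = {
    "conf-rc-hash-comments": "#",
    "conf-rc-percent-comments": "%",
    "conf-rc-C-comments": "//",
}


def strip_comment(line: str, prefix: str) -> str:
    """Drop a whole-line or trailing comment from one line (quotes are ignored)."""
    if line.lstrip().startswith(prefix):
        return ""  # the whole line is a comment
    # Trailing comment: whitespace must precede the prefix, so "http://x" survives.
    m = re.search(r"\s" + re.escape(prefix), line)
    return (line[: m.start()] if m else line).rstrip()


print(strip_comment("name = x  # note", COMMENT_PREFIXES["conf-rc-hash-comments"]))
print(strip_comment("url = http://example.com // mirror", COMMENT_PREFIXES["conf-rc-C-comments"]))
```

One known limitation of this rule: a value that itself begins with the prefix, such as a color written as "#fff", would be cut off, so a real lexer would need format-specific handling there.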

I suggest opening various conf/rc files in micro, nano, and neovim with syntax highlighting on. Get some ideas that way. Nano needs work to install a full suite of syntax files. Neovim needs config work to get there too, but you can find pre-made setups like LunarVim.

I like raw booleans (t, f, T, F, true, false, True, False, TRUE, FALSE, 1, 0) and numbers (0, 1, 2, 3.14159, 4.1234e12) to be colored. If quoted, they become strings for lexing purposes, like any other string. I don't know about escaping special characters. Not many conf/rc formats support such strings, but some that define regular expressions do.
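These coloring rules can be sketched as a small classifier in Python. The category names and the regex are assumptions; since 0 and 1 appear in both lists above, this sketch arbitrarily counts them as numbers.

```python
import re

BOOLEANS = {"t", "f", "T", "F", "true", "false", "True", "False", "TRUE", "FALSE"}
# Integers, decimals, and exponent forms such as 4.1234e12.
NUMBER_RE = re.compile(r"[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?")


def classify(value: str) -> str:
    """Return 'string', 'boolean', 'number', or 'plain' for one raw value."""
    # Quoted values are strings, whatever they contain.
    if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
        return "string"
    if value in BOOLEANS:
        return "boolean"
    if NUMBER_RE.fullmatch(value):
        return "number"  # note: 0 and 1 land here, not under 'boolean'
    return "plain"


for v in ["TRUE", "3.14159", "4.1234e12", '"true"', "my-special-value-here"]:
    print(v, "->", classify(v))
```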

Thanks!
main Alexey
Posts: 2931
Joined: 25.08.2021 18:15

Re: Generic Lexer Missing for rc/conf/config Files

Post by main Alexey »

I posted a link to the finished lexer, then removed it. I want to remake it as a lite lexer.
main Alexey
Posts: 2931
Joined: 25.08.2021 18:15

Re: Generic Lexer Missing for rc/conf/config Files

Post by main Alexey »

The lite lexer is ready; I made it within 10 minutes.
https://github.com/CudaText-addons/Cuda ... /Config-rc
Copy the *.cuda-litelexer file to 'data/lexliblite'.
Please test.
It is easy for you to change: comment kinds, special words, special symbols, etc.
Random Void User
Posts: 104
Joined: 13.07.2024 21:19

Re: Generic Lexer Missing for rc/conf/config Files

Post by Random Void User »

Done, installed by Addons Manager for testing (I made a .zip).

Many "rc" files look like ".manglrc" or "shellcheckrc" or the name can be "conf" alone. So change

Code:

"files": "*.rc;*.conf;config;.config"
into

Code:

"files": "*rc;conf;*.conf;config;*.config"
I assume the asterisk means "zero or more characters", so "*.config" matches ".config" by itself, and "*.conf" matches ".conf" by itself, as the zero-length corner cases.

Note ".bashrc" matches "*rc" but needs the bash lexer instead. I don't know how Cud might prioritize it.
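The "zero or more characters" assumption can be checked against Python's fnmatch as a stand-in. This only shows how shell-style wildcards behave in Python; whether CudaText's own matcher treats "*" the same way is not confirmed here. The helper name is made up.

```python
from fnmatch import fnmatchcase

OLD = "*.rc;*.conf;config;.config"
NEW = "*rc;conf;*.conf;config;*.config"


def matches(name: str, patterns: str) -> bool:
    """True if the file name matches any of the semicolon-separated patterns."""
    return any(fnmatchcase(name, p) for p in patterns.split(";"))


for name in [".manglrc", "shellcheckrc", "conf", ".config", ".conf", ".bashrc"]:
    print(f"{name:14} old={matches(name, OLD)!s:5} new={matches(name, NEW)}")
```

Under these semantics "*.config" does match ".config", and ".bashrc" does match "*rc", so the priority question for the bash lexer remains.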