qualiu@mypc /cygdrive/c/Users/qualiu
$ ~/tool/nin.cygwin
Get difference-set(not-in-latter) for first file/pipe; Or intersection-set with latter file/pipe. by LQM:
-u [ --unique ] Get unique results, discard self/mutual duplicate lines/keys (key = captured groups[1] if set 1~2 Regex patterns).
-m [ --intersection ] Get mutual lines/keys intersection in 2 files or file-with-pipe (default is exclusive: 'not-in-latter').
-i [ --ignore-case ] Ignore case for plain text matching and Regex matching.
-n [ --out-not-captured ] Also output not-captured keys/lines of Regex pattern in first file/pipe.
-p [ --percentage ] Output percentages of keys/lines at each line head, and sort by percentages.
-w [ --out-whole-line ] Output whole matched lines instead of keys (key = captured groups[1] of Regex pattern).
-a [ --ascending ] Ascending sort output by lines or captured-keys or percentages.
-d [ --descending ] Descending sort output by lines or captured-keys or percentages.
-k [ --stop-at-count ] arg Stop if the matched count of a key/line > [N] when ascending output, or if count < [N] when descending output.
-K [ --stop-percentage ] arg Stop if the percentage of a key/line > [P%] when ascending output, or if percentage < [P%] when descending output.
-A [ --no-any-info ] Do not output any info: no warnings, no summary (if no errors), only the pure result (please always use -PAC or -PC).
-I [ --info-normal-out ] Output summary info to stdout (default is to stderr).
-M [ --no-summary ] Not output summary info.
-O [ --out-not-0-sum ] Output summary only if the results count is not 0.
-C [ --no-color ] No color for output (it's better to not add color if have subsequent matching or processing).
--colors arg Set fore_back colors for -t/-e/-x;d/f/p;m/u like: 'Red' or 't=Red,x=Yellow,e=Green' or 't = red + Yellow_Blue, x = Cyan'.
--keep-color Keep color of output result for Windows/MinGW - to be uniform color style with Cygwin/Linux/MacOS.
--unix-slash arg Set 1 to output uniform forward slash '/' on Windows + MinGW + Cygwin, like 'c:/' instead of '/c/' or '/cygdrive/c/'.
--to-stderr Output result to stderr. Default: result -> stdout, error/warn/info/verbose -> stderr.
-P [ --no-percent ] Not output percentages (Overwrite --percentage).
--sum Sum accumulative counts and percentages(if used -p) for each key/line.
--not-warn-bom Not output BOM warnings when reading BOM files which BOM header bytes != 0xEFBBBF.
-H [ --head ] arg Output top [N] rows of whole output if N > 0; Skip top [N] lines if N < 0; [N] = 0 means not output.
-T [ --tail ] arg Output bottom [N] rows of whole output if N > 0; Skip bottom [N] lines if N < 0; [N] = 0 means not output.
-J [ --jump-out ] Jump out (stop and exit) if has set -H [N] and already has output [N] lines.
--timeout arg Maximum waiting seconds to stop and exit. No limit if value <= 0. Default = 0.000 s.
-S [ --switch-first ] Switch positions (first/latter roles) of 2 files or file/pipe (also will switch their Regex patterns).
-Z [ --skip-last-empty ] Skip last empty line in first/latter file.
-x [ --has-text ] arg Line must contain this normal/plain text (Can use meanwhile: -t, -x, --nt, --nx).
--nx arg Line must not contain normal/plain text: Exclude/Skip rows.
-t [ --text-match ] arg Regex pattern for line text must match (Can use meanwhile: -t, -x, --nt, --nx). Use -t value to filter even if used -e.
--nt arg Regex pattern for lines must not match: Exclude/Skip rows.
-e [ --enhance ] arg Regex pattern to color output, inferior to: -t -x. Use merged Regex value of "(-t)|-e" to enhance if used both -t and -e.
-Y [ --not-from-pipe ] Force reading from files rather than pipe (to avoid reading pipe if running inside another command).
-c [ --show-command ] Show command line, you can append text for debug, or extraction after -c (if append text, -c and text must be last).
--exit arg Change return value, format: [Number] or [Regex-or-Math]-to-[Exit-Code], like: '1' or '-?\d+-to-1' or 'lt0-to-1,255-to-1'.
--verbose Show parsed arguments, return value, time zone, BOM rows and EXE path, etc.
-h [ --help ] See usage and examples below. More detail: https://github.com/qualiu/msr
Usage: nin File1-or-pipe File2-or-/dev/null [Regex-capture1-pattern-1] [Regex-capture1-pattern-2] [Options like: -i -u -m -w -H 2 -t xxx --nt xxx]
All [Quoted Args Options] above are Optional, can be omitted.
If [Regex-capture1-pattern-N] is set, it must have Regex capture group[1]. Simple examples: "^(.+)$" or "(.+)" or "^(\S+)" or "^([^\t]+)" or "^(\w+)" etc.
If only [Regex-capture1-pattern-1] is set, then [Regex-capture1-pattern-2] will use the same pattern.
If neither is set, normal whole-line text comparison is used: check lines in file1/pipe which are not-in/in file2.
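
The key comparison described above can be sketched in Python. This is only an illustration of the documented semantics, not nin's actual implementation: the function name nin_like and its parameters are hypothetical, and comparing with lowercased keys for -i is an assumption.

```python
import re

def nin_like(first, latter, pattern1=None, pattern2=None,
             intersection=False, unique=False, ignore_case=False):
    """Hypothetical emulation: keys of `first` not in (or in) `latter`."""
    pattern2 = pattern2 or pattern1          # pattern-2 defaults to pattern-1
    flags = re.IGNORECASE if ignore_case else 0

    def key_of(line, pattern):
        # Without a pattern the whole line is the key; otherwise capture group 1.
        if pattern is None:
            return line
        m = re.search(pattern, line, flags)
        return m.group(1) if m else None

    def norm(key):
        # Assumption: case-insensitive comparison is done on lowercased keys.
        return key.lower() if ignore_case else key

    latter_keys = set()
    for line in latter:
        key = key_of(line, pattern2)
        if key is not None:
            latter_keys.add(norm(key))

    seen, out = set(), []
    for line in first:                       # original order is preserved
        key = key_of(line, pattern1)
        if key is None:
            continue                         # not captured: skipped here
        if (norm(key) in latter_keys) != intersection:
            continue
        if unique:
            if norm(key) in seen:
                continue
            seen.add(norm(key))
        out.append(key)
    return out

first = ["apple\t3", "Banana\t1", "apple\t9", "cherry\t2"]
latter = ["query = banana", "query = durian"]
print(nin_like(first, latter, r"^([^\t]+)", r"query = (.+)$",
               unique=True, ignore_case=True))
```

With the sample data this prints the keys that appear only in the first list; passing intersection=True would instead return the mutual keys, which corresponds to -m.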
Example-1: /home/qualiu/tool/nin.cygwin daily-sample.txt selected-queries.txt "^([^\t]+)" "query = (.+)$" -u -i
Example-2: nin daily-sample.txt /dev/null -p -i
Example-3: cat daily-sample.txt | nin /dev/null -pi
Example-1 uses regex capture1 to get new queries: only in daily-sample.txt but not in the latter file (if use -m will get intersection set of the 2 files).
Example-2/3 are the same: get unique(-u) lines in the file and show the percentage(-p) of each, in order.
Return value/Exit code($?) = matched line/key count in {first file/pipe} or {mutual intersection}.
But if return value = 0 and errors were caught, nin will exit with return value = -1 (probably 255 on Linux/MacOS or 127 on MinGW, as changed by shells like bash).
All error messages will be output to stderr. You can redirect them to stdout by appending 2>&1 to your command line.
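
Why -1 can show up as 255: on POSIX systems only the low 8 bits of an exit code reach the parent process. The sketch below shows just that arithmetic. The helper name posix_exit_status is hypothetical, and this does not explain the 127 reported for MinGW.

```python
def posix_exit_status(code):
    """Low 8 bits of an exit code, as a POSIX parent process sees it."""
    return code & 0xFF

print(posix_exit_status(-1))   # 255
```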
Useful options: -H 20 -J, -H 0, -T 3, -k 30, -K 33.33, -T -1, -M, -S, -PAC, -i -u, -iuw, -iuwa, -ip, -ipa, -ipdw, -ium, -iumw, -iwn, -im, -imw, -ipdPAC
-m -u : Get unique mutual intersection.
-p -d : Get top distributions/percentages and sort by count/percentages with descending order.
-w -n : Skip lines/keys both matched in latter + first files/pipe, output other keys' lines + non-matched lines (like description/comments) in first.
nin treats Windows nul the same as /dev/null on MinGW / Cygwin / Linux / MacOS.
One important feature: nin does not change the original line order, even with unique(-u), unless sorting is requested by -p/-d/-a/etc.
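
A minimal Python sketch of order-preserving unique output, to illustrate the feature above. The function name unique_keep_order is hypothetical, and keeping the first occurrence of each duplicate is an assumption about nin's behavior.

```python
def unique_keep_order(lines, ignore_case=False):
    """Drop duplicate lines while keeping the original order."""
    seen = set()
    result = []
    for line in lines:
        key = line.lower() if ignore_case else line
        if key not in seen:          # assumption: first occurrence wins
            seen.add(key)
            result.append(line)
    return result

print(unique_keep_order(["b", "A", "a", "b", "c"], ignore_case=True))
```

Unlike a sort-then-dedupe pipeline, the surviving lines stay in their input order.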
Frequent use cases as Quick-Start: Use -PAC or -PC to get pure output result.
nin my.txt nul -u -i : output unique lines in my.txt, ignoring case.
cat my.txt | nin nul -ui : output unique lines in my.txt, ignoring case.
nin my.txt nul "^(\w+)" -u -i : output unique keys (captured words at the beginning of each line) in my.txt, ignoring case.
nin my.txt nul "^(\w+)" -u -wi : output unique lines (lines of the captured keys) in my.txt, ignoring case.
nin my.txt other.txt "(my-capture1)" "(other-capture1)" -u : output captured keys in my.txt not in other.txt.
nin my.txt other.txt "(my-capture1)" "(other-capture1)" -u -S : output captured keys in other.txt not in my.txt.
nin my.txt other.txt "(my-capture1)" "(other-capture1)" -u -m : output mutual keys in other.txt and my.txt.
nin error.log nul "(\w*Exception)" -pd -H 30 : Get error categories, distribution and percentage, output top 30 errors.
nin error.log nul "(\w*Exception)" -pd -H 30 -I > report.txt : Save top 30 errors + summary info to report.txt.
nin error.log nul "(\w*Exception)" -pd -H 30 -PAC : Get top 30 errors of raw text without percentages and summary info.
nin my-config.ini exclude.csv "name = (\w+Exception)" "(\w+Exception)" -iwn > new-config.ini : Output whole lines in my-config.ini except lines also captured in exclude.csv.
nin -h -C | nin nul "^\s{2}(-+\S+)\s+" -w --nt help : Get all command options of nin and output with original order.
nin -h -C | nin nul "^\s{2}-(\w)\s+" -wa --nt help : Get all single letter command options of nin and output with ascending order.
nin -h -C | nin nul "^\s{2}-(\w)\s+" -wpdi : Get percentages of nin single letter command options.
nin -h -C | nin nul "^\s{2}-(\w)\s+" -wpdi -k 2 : Get percentages of nin single letter command options whose matched count >= 2.
nin -h -C | nin nul "^\s{2}-(\w)\s+" -wpdi -K 5.0 : Get percentages of nin single letter command options whose percentage >= 5%.
nin -h -C | nin nul "^\s{2}-(\w)\s+" -wpdi -k 2 -K 5.0 -P : Get nin single letter command options with count >= 2 and percentage >= 5%, without percentage info.
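
The distribution examples above (-p -d with -k and -H) can be approximated with this Python sketch. It is an illustration only: the function name distribution and its parameters are hypothetical, and using the number of captured keys as the percentage denominator is an assumption, not confirmed nin behavior.

```python
import re
from collections import Counter

def distribution(lines, pattern, min_count=None, top=None):
    """Count captured keys, sort descending, return (key, count, percent)."""
    keys = [m.group(1) for m in (re.search(pattern, l) for l in lines) if m]
    total = len(keys)                      # assumed denominator
    counts = Counter(keys)
    # sorted() is stable, so keys with equal counts keep first-seen order.
    rows = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    result = [(k, c, 100.0 * c / total) for k, c in rows]
    if min_count is not None:              # like -k N with descending output
        result = [r for r in result if r[1] >= min_count]
    if top is not None:                    # like -H N
        result = result[:top]
    return result

log = [
    "IOException at a",
    "NullPointerException at b",
    "IOException at c",
    "no error here",
    "IOException at d",
]
for key, count, percent in distribution(log, r"(\w*Exception)"):
    print(f"{percent:.2f}% ({count}) {key}")
```

The print format here is made up for readability and does not reproduce nin's real output layout.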
One limitation: Cannot process Unicode files or pipe for now; Fine with UTF-8/ANSI/etc.
Search usage like: nin -h | msr -i -t return.+value or nin -hC | msr -it "Summary|Jump|Sort" -x out -U 2 -D2 or nin | msr -ix switch -t Regex -e "latter|first"
You can preset env: MSR_EXIT, MSR_OUT_INDEX, MSR_NO_COLOR, MSR_COLORS, MSR_OUT_FULL_PATH, MSR_NOT_WARN_BOM, MSR_SKIP_LAST_EMPTY, MSR_KEEP_COLOR, MSR_UNIX_SLASH for --unix-slash / --keep-color / etc.
Together with msr.cygwin, nin is more powerful: load files/read pipe, extract/transform, pre/post-processing: https://github.com/qualiu/msr
Example: Get case-insensitive unique paths + descending sort-by-percentage to show top 2 duplicate paths + Merge trimmed one line paths to new $PATH:
msr -z "$PATH" -t "/*?\s*:\s*" -o "\n" -aPAC | nin nul "(\S+.+)" -i -u
msr -z "$PATH" -t "/*?\s*:\s*" -o "\n" -aPAC | nin nul "(\S+.+)" -i -u -d -p -k 2
msr -z "$PATH" -t "/*?\s*:\s*" -o "\n" -aPAC | nin nul "(\S+.+)" -i -u -PAC | msr -S -t "[\r\n]+(\S+)" -o ':\1' -aPAC
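
For readers without msr at hand, the idea of the last pipeline (split $PATH, trim trailing slashes, drop case-insensitive duplicates, re-join) can be roughly sketched in Python. The function name dedupe_path is hypothetical and the sketch does not reproduce the exact regex behavior of the msr commands above.

```python
def dedupe_path(path, sep=":"):
    """Remove case-insensitive duplicate entries, keeping original order."""
    seen = set()
    kept = []
    for entry in path.split(sep):
        entry = entry.strip().rstrip("/")   # trim spaces and trailing slashes
        if not entry:
            continue                        # skip empty entries
        key = entry.lower()
        if key not in seen:
            seen.add(key)
            kept.append(entry)
    return sep.join(kept)

print(dedupe_path("/usr/bin:/usr/local/bin/:/USR/BIN:/opt/tool"))
```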
As a portable cross platform tool, nin has been running on: Windows / MinGW / Cygwin / Ubuntu / CentOS / Fedora / Darwin / FreeBSD
Aperiodic updates: https://github.com/qualiu/msr , more tools: https://github.com/qualiu/msrTools + https://github.com/qualiu/msrUI + https://github.com/qualiu/vscode-msr
Call@Everywhere: Add the nin.cygwin directory to the system environment variable PATH (optionally copy or rename nin.cygwin to nin), like: export PATH="$PATH:/home/qualiu/tool"
or by alias : alias nin='/home/qualiu/tool/nin.cygwin'
or by link : ln -sf '/home/qualiu/tool/nin.cygwin' /usr/bin/nin
or copy to system : cp -ap '/home/qualiu/tool/nin.cygwin' /usr/bin/nin