The goal of this tool is to filter the lines of a file with a logic-like language. That is, there are some built-in predicates that apply operations to lines.
Each predicate takes as input at least one variable or constant holding a string (the line under consideration) and either unifies a variable with the result of the operation or simply succeeds or fails.
Why? Because I often have to write the same Python script to scan log files with a lot of text and extract results that have a specific structure.
Assume you have a file called log.txt which contains something like
and you want to extract the AUCPR results and average them. With take you can quickly do so:
Output
Another example: extract the real value of the bash time and convert it into seconds. Suppose you have a file log.txt of the form
To do so:
Output
Install uv and run take with uv run take (see the options below or use the -h flag).
Variables start with an uppercase letter, while constants start with a lowercase letter, are numbers, or are enclosed in single quotes. The execution model is simple: each command starts with a line/1 predicate, which unifies its argument with the content of the current file line. For instance, line(L) binds L to the current line when L is a variable; if the argument is a constant, it checks whether the current line is equal to that constant. The subsequent predicates are then applied in order of appearance until one fails. To print results, use print/1 or println/1.
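The execution model described above can be sketched in Python. This is only an illustration of the idea, not how take is implemented: the names Bindings, Step, and run_command are hypothetical, and stripping the trailing newline from each line is an assumption.

```python
from typing import Callable, Iterable, Optional

# A step receives the current variable bindings and returns the updated
# bindings, or None to signal failure. Hypothetical names, not take's API.
Bindings = dict[str, str]
Step = Callable[[Bindings], Optional[Bindings]]


def startswith(var: str, prefix: str) -> Step:
    """Succeed only if the value bound to `var` starts with `prefix`."""
    return lambda b: b if b[var].startswith(prefix) else None


def strip(var: str, out: str) -> Step:
    """Bind `out` to the value of `var` without surrounding whitespace."""
    return lambda b: {**b, out: b[var].strip()}


def run_command(lines: Iterable[str], line_var: str, steps: list[Step]) -> list[Bindings]:
    """Apply the steps to each line in order, stopping at the first failure."""
    results = []
    for raw in lines:
        # Plays the role of line/1: bind the variable to the current line.
        # Dropping the trailing newline is an assumption of this sketch.
        bindings: Optional[Bindings] = {line_var: raw.rstrip("\n")}
        for step in steps:
            bindings = step(bindings)
            if bindings is None:
                break  # a predicate failed: skip the rest of this line
        if bindings is not None:
            results.append(bindings)
    return results
```

Calling run_command(lines, "L", [startswith("L", "b"), strip("L", "L1")]) roughly mirrors a command such as line(L), startswith(L,b), strip(L,L1): only lines that pass every step produce bindings.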
Available predicates:
- line(L): unifies L with the current file line. Note: each command must have line/1 in it
- print(L)/println(L): print the content of L (println/1 also adds a newline)
- startswith(L,P): true if L starts with P
- endswith(L,P): as startswith/2, but checks the end of the string
- length(L,N): true if L is of length N
- lt(L,N): true if L < N
- gt(L,N): true if L > N
- leq(L,N): true if L <= N
- geq(L,N): true if L >= N
- eq(L,N): true if L == N
- neq(L,N): true if L != N
- capitalize(L,C): C is the capitalized version of L, i.e., the first character is upper case and the rest are lower case
- split_select(L,V,P,L1): splits L at each occurrence of V and unifies L1 with the piece at position P, counting from 0. Fails if P is larger than the number of splits. Special split delimiters: V = space and V = tab
- replace(L,A,B,L1): replaces the occurrences of the string A in L with B and unifies L1 with the result
- contains(L,A): true if the string unified with L contains the string unified with A, false otherwise
- strip(L,L1): removes leading and trailing whitespace from L and unifies L1 with the result
- time_to_seconds(L,L1): converts a bash time of the form AmBs into seconds (example: L = 2m42.765s gives L1 = 162.765)
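To make the semantics of split_select/4 and time_to_seconds/2 concrete, here is a minimal Python sketch of what they compute. It is not the tool's source code: returning None stands in for predicate failure, and both the single-character meaning of space and tab and the handling of malformed input are assumptions.

```python
from typing import Optional


def split_select(line: str, delimiter: str, position: int) -> Optional[str]:
    """Return the piece at `position` (0-based) after splitting, or None on failure."""
    # `space` and `tab` are the special delimiter names from the list above;
    # mapping them to one literal character each is an assumption.
    sep = {"space": " ", "tab": "\t"}.get(delimiter, delimiter)
    parts = line.split(sep)
    if position < 0 or position >= len(parts):
        return None  # models the predicate failing
    return parts[position]


def time_to_seconds(value: str) -> Optional[float]:
    """Convert a bash time such as '2m42.765s' to seconds, or None if malformed."""
    if "m" not in value or not value.endswith("s"):
        return None
    # Drop the trailing 's', then separate minutes from seconds.
    minutes, seconds = value[:-1].split("m", 1)
    try:
        return int(minutes) * 60 + float(seconds)
    except ValueError:
        return None
```

For the example in the list, time_to_seconds("2m42.765s") computes 2 * 60 + 42.765, and split_select("a b c", "space", 1) picks the second piece.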
You can also prepend not to predicates (except to line/1, print/1, and println/1) to flip the result.
You can pass arguments as strings by enclosing them in single quotes (e.g., 'Hello' will be treated as a string and not as a variable).
You can also aggregate the results of the applications of the predicates on the file with the option -a/--aggregate.
Available aggregates (some are self-explanatory):
- count: count the lines
- sum
- product
- average
- min
- max
- concat: concatenates the lines
- unique: filter unique lines
- first
- last
- sort_ascending
- sort_descending
If you want only the result of the aggregation and suppress the other output, you can use the flag -so/--suppress-output.
You can specify multiple aggregates by repeating the flag.
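As a rough mental model, each aggregate can be seen as a function from the list of printed values to a single result. The Python sketch below is an assumption-laden illustration, not take's implementation: the names AGGREGATES and aggregate are hypothetical, the numeric aggregates are assumed to parse values as floats, and sorting is shown on the raw strings.

```python
import math
from statistics import mean


def _nums(values):
    """Parse the printed values as floats (an assumption of this sketch)."""
    return [float(v) for v in values]


# Hypothetical table mapping aggregate names to plain Python functions.
AGGREGATES = {
    "count": len,
    "sum": lambda vs: sum(_nums(vs)),
    "product": lambda vs: math.prod(_nums(vs)),
    "average": lambda vs: mean(_nums(vs)),
    "min": lambda vs: min(_nums(vs)),
    "max": lambda vs: max(_nums(vs)),
    "concat": "".join,
    "unique": lambda vs: list(dict.fromkeys(vs)),  # keeps first occurrences, in order
    "first": lambda vs: vs[0],
    "last": lambda vs: vs[-1],
    "sort_ascending": sorted,  # lexicographic here; the tool may sort numerically
    "sort_descending": lambda vs: sorted(vs, reverse=True),
}


def aggregate(values, names):
    """Apply every requested aggregate to the same list of values."""
    return {name: AGGREGATES[name](values) for name in names}
```

Repeating the -a flag then corresponds to passing several names, for example aggregate(values, ["count", "average"]).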
Assume the file is called f.txt.
Count the empty lines from a file: take -f f.txt -c "line(L), length(L,N), lt(N,1), println(L)" -a count -so
Given a file where each line contains results separated by spaces, pick the second element of each line and sum them all: take -f f.txt -c "line(L), split_select(L,space,1,L1), println(L1)" -a sum -so
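For comparison, this is roughly the kind of throwaway Python script that the two commands above replace. It is a sketch under assumptions: the function names are hypothetical, lines are split on single spaces, lines without a second field are skipped, and the second field is assumed to be numeric.

```python
def count_empty_lines(lines):
    """Count lines that are empty once the trailing newline is removed."""
    return sum(1 for line in lines if len(line.rstrip("\n")) < 1)


def sum_second_field(lines):
    """Sum the second space-separated field of every line that has one."""
    total = 0.0
    for line in lines:
        parts = line.rstrip("\n").split(" ")
        if len(parts) > 1:  # lines without a second field are skipped
            total += float(parts[1])
    return total
```

Either function can be fed an open file object, e.g. with open("f.txt") as f: print(sum_second_field(f)).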
Suggestions, issues, pull requests, etc., are welcome.
The program is provided as is and it may contain bugs.
