Implement walking of directories
Recurse into directories using BurntSushi's walkdir crate. This means
the following is now possible:

    unicop src/

Also set the default to '.' (i.e. recursively checking the current directory).
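
For readers unfamiliar with walkdir, the behaviour described above can be approximated with the standard library alone. This is only a sketch under assumptions: `targets_or_default` and `collect_files` are hypothetical names that do not appear in this commit, and the commit itself relies on `walkdir::WalkDir` rather than on hand-written recursion.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: fall back to "." when no paths were given.
fn targets_or_default(args: Vec<String>) -> Vec<String> {
    if args.is_empty() {
        vec![String::from(".")]
    } else {
        args
    }
}

/// Hypothetical helper: collect every regular file under `root`,
/// recursing into subdirectories. Symlinks are not followed.
fn collect_files(root: &Path, out: &mut Vec<PathBuf>) -> io::Result<()> {
    let meta = fs::symlink_metadata(root)?;
    if meta.is_file() {
        out.push(root.to_path_buf());
    } else if meta.is_dir() {
        for entry in fs::read_dir(root)? {
            collect_files(&entry?.path(), out)?;
        }
    }
    Ok(())
}

fn main() {
    let args = targets_or_default(std::env::args().skip(1).collect());
    for arg in &args {
        let mut files = Vec::new();
        // Report traversal errors on stderr, then continue with whatever was found.
        if let Err(err) = collect_files(Path::new(arg), &mut files) {
            eprintln!("{}: {}", arg, err);
        }
        for file in &files {
            println!("{}", file.display());
        }
    }
}
```

Unlike this sketch, walkdir yields errors per entry, so one unreadable directory does not cut the rest of that subtree's traversal short.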

I've read in the documentation that tree-sitter actually has a
language-detection framework that grammars can plug into, but
unfortunately, it doesn't seem to be exposed in the library.
It might still be a good idea to adopt it, as it will make it easier to
add new languages.
This commit contains a stripped-down version that only looks at
file extensions (tree-sitter's language detection also uses regexps, but none
of the current grammars define any).
gregoire-mullvad committed Jul 24, 2024
1 parent 44310db commit 20b325e
Showing 4 changed files with 75 additions and 19 deletions.
29 changes: 26 additions & 3 deletions Cargo.lock

1 change: 1 addition & 0 deletions Cargo.toml
@@ -26,6 +26,7 @@ unic-ucd-block = "0.9.0"
unic-char-range = "0.9.0"
toml = "0.8.14"
serde = { version = "1.0.203", features = ["derive"] }
walkdir = "2.5.0"

[dev-dependencies]
trycmd = "0.15.5"
8 changes: 8 additions & 0 deletions README.md
@@ -1,5 +1,13 @@
# Unicop

## Usage

```
unicop [FILES]...
```

Where `[FILES]...` is a list of files or directories to check, default: `.`.

## Example

```console
56 changes: 40 additions & 16 deletions src/main.rs
@@ -1,33 +1,41 @@
use std::env;
use std::fs;
use std::path::Path;

use miette::{miette, LabeledSpan, NamedSource, Severity};
use unic_ucd_name::Name;

mod config;
mod rules;

fn main() {
for arg in env::args().skip(1) {
check_file(&arg);
let mut args: Vec<String> = env::args().skip(1).collect();
if args.is_empty() {
args = vec![String::from(".")]
}
for arg in args.iter() {
for entry in walkdir::WalkDir::new(arg) {
match entry {
Err(err) => eprintln!("{:}", err),
Ok(entry) if entry.file_type().is_file() => check_file(entry.path()),
Ok(_) => {}
}
}
}
}

fn check_file(arg: &str) {
let src = fs::read_to_string(arg).unwrap();
let nsrc = NamedSource::new(arg, src.clone());
fn check_file(path: &Path) {
let Some(lang) = detect_language(path) else {
return;
};
let filename = path.display().to_string();
let src = fs::read_to_string(path).unwrap();
let nsrc = NamedSource::new(&filename, src.clone());
let mut parser = tree_sitter::Parser::new();
parser
.set_language(&tree_sitter_javascript::language())
.expect("Error loading JavaScript grammar");
// parser
// .set_language(&tree_sitter_python::language())
// .expect("Error loading Python grammar");
parser.set_language(&lang).expect("Error loading grammar");
let tree = parser.parse(&src, None).unwrap();
if tree.root_node().has_error() {
println!(
"{:?}",
miette!(severity = Severity::Warning, "{}: parse error", arg).with_source_code(nsrc)
miette!(severity = Severity::Warning, "{}: parse error", filename)
.with_source_code(nsrc)
);
}
for (off, ch) in src.char_indices() {
@@ -52,7 +60,23 @@ fn check_file(arg: &str) {
chname,
node.kind()
)
.with_source_code(NamedSource::new(arg, src.clone()));
.with_source_code(NamedSource::new(&filename, src.clone()));
println!("{:?}", report);
}
}

// Tree-sitter grammars include some configurations to help decide whether the language applies to
// a given file.
// Unfortunately, neither the language-detection algorithm nor the configurations are included in
// the Rust crates. So for now we have a simplified language-detection with hard-coded
// configurations.
// See https://tree-sitter.github.io/tree-sitter/syntax-highlighting#language-detection
fn detect_language(path: &Path) -> Option<tree_sitter::Language> {
match path.extension()?.to_str()? {
// https://github.com/tree-sitter/tree-sitter-javascript/blob/master/package.json
"js" | "mjs" | "cjs" | "jsx" => Some(tree_sitter_javascript::language()),
// https://github.com/tree-sitter/tree-sitter-python/blob/master/package.json
"py" => Some(tree_sitter_python::language()),
_ => None,
}
}
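
Since the commit message notes that a proper detection scheme would make it easier to add new languages, here is one possible shape for that: a table-driven lookup that keeps the extension lists in data rather than in match arms. It is a sketch under assumptions, not part of this commit. `Lang`, `FILE_TYPES` and `detect_lang` are hypothetical names, and `Lang` stands in for `tree_sitter::Language` so that the example compiles without the grammar crates. The extensions are copied from `detect_language` above.

```rust
use std::path::Path;

/// Stand-in for `tree_sitter::Language` (hypothetical, for illustration only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Lang {
    JavaScript,
    Python,
}

/// Hypothetical table mapping each language to its file extensions.
/// Supporting a new language means adding one row here.
const FILE_TYPES: &[(Lang, &[&str])] = &[
    (Lang::JavaScript, &["js", "mjs", "cjs", "jsx"]),
    (Lang::Python, &["py"]),
];

/// Return the language whose extension list contains the file's extension.
/// Files without an extension, or with a non-UTF-8 one, yield `None`.
fn detect_lang(path: &Path) -> Option<Lang> {
    let ext = path.extension()?.to_str()?;
    FILE_TYPES
        .iter()
        .find(|(_, exts)| exts.contains(&ext))
        .map(|(lang, _)| *lang)
}

fn main() {
    for name in ["src/app.mjs", "tool.py", "README.md"] {
        println!("{}: {:?}", name, detect_lang(Path::new(name)));
    }
}
```

The comparison is case-sensitive, as in the match in `detect_language`, so a file named `A.PY` would not be detected by either version.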
