Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Primitive functionality to display comment sections in outlines #571

Merged
merged 10 commits into from
Oct 14, 2024
Merged
2 changes: 2 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,8 @@

# 0.1.9000

- Added partial support for outline headers in comments (@kv9898, posit-dev/positron#3822).

- Sending long inputs of more than 4096 bytes no longer fails (posit-dev/positron#4745).

- Jupyter: Fixed a bug in the kernel-info reply where the `pygments_lexer` field
Expand Down
2 changes: 1 addition & 1 deletion crates/ark/src/lsp/indexer.rs
Original file line number Diff line number Diff line change
Expand Up @@ -55,7 +55,7 @@ type DocumentSymbolIndex = HashMap<DocumentSymbol, IndexEntry>;
type WorkspaceIndex = Arc<Mutex<HashMap<DocumentPath, DocumentSymbolIndex>>>;

static WORKSPACE_INDEX: LazyLock<WorkspaceIndex> = LazyLock::new(|| Default::default());
static RE_COMMENT_SECTION: LazyLock<Regex> =
pub static RE_COMMENT_SECTION: LazyLock<Regex> =
LazyLock::new(|| Regex::new(r"^\s*(#+)\s*(.*?)\s*[#=-]{4,}\s*$").unwrap());

#[tracing::instrument(level = "info", skip_all)]
Expand Down
118 changes: 114 additions & 4 deletions crates/ark/src/lsp/symbols.rs
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
//
// symbols.rs
//
// Copyright (C) 2022 Posit Software, PBC. All rights reserved.
// Copyright (C) 2022-2024 Posit Software, PBC. All rights reserved.
//
//

Expand Down Expand Up @@ -60,7 +60,7 @@ pub fn symbols(params: &WorkspaceSymbolParams) -> anyhow::Result<Vec<SymbolInfor
IndexEntryData::Section { level: _, title } => {
info.push(SymbolInformation {
name: title.to_string(),
kind: SymbolKind::MODULE,
kind: SymbolKind::STRING,
location: Location {
uri: Url::from_file_path(path).unwrap(),
range: entry.range,
Expand Down Expand Up @@ -120,13 +120,54 @@ fn is_indexable(node: &Node) -> bool {
true
}

// Function to parse a comment and return the section level and title
fn parse_comment_as_section(comment: &str) -> Option<(usize, String)> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please add some tests for this function?

// Match lines starting with one or more '#' followed by some non-empty content and must end with 4 or more '-', '#', or `=`
// Ensure that there's actual content between the start and the trailing symbols.
if let Some(caps) = indexer::RE_COMMENT_SECTION.captures(comment) {
let hashes = caps.get(1)?.as_str().len(); // Count the number of '#'
let title = caps.get(2)?.as_str().trim().to_string(); // Extract the title text without trailing punctuations
return Some((hashes, title)); // Return the level based on the number of '#' and the title
}

None
}
Comment on lines +124 to +134
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here the function does return the number of hashes (level of comment). I wanted this to help with the foldable comments/nested outlines. But I struggled to get this to work.


fn index_node(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some tests are needed for this function too. You can do this to get a Rope and node:

let code = "my_code";
let document = Document::new(code, None);
let node = ast.root_node();

node: &Node,
contents: &Rope,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Meta comment

Looks like our indexer, which watches the whole "workspace" (i.e. multiple folders of files), also does something similar. i.e. see index_comment() there.

That indexer powers symbols() in this file, which powers workspace symbols. i.e. when you hit CMD + T then it pops up all the symbols in the workspace - even in closed files.

Screenshot 2024-10-11 at 3 17 20 PM

What we are changing here powers document symbols, i.e. symbols within a single document. This powers the Outline editor view we have been talking about, and it powers this Outline quick pick

Screenshot 2024-10-11 at 3 19 04 PM

It will eventually probably be good to merge all the infrastructure here. Ideally we'd have one set of code that we can run on a file that returns all its symbols, and then we map them into either a workspace symbol or document symbol (its essentially indexer::index_file() i think, but it doesn't know how to create nested outlines, which is important).

I don't think this should block anything, but we should keep this in mind and open an issue to merge all of this code

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds very desirable! I would really appreciate the nested outline and foldable comment functionalities. Sadly, as a student primarily working on R-based data analysis projects, I'm new to Rust and unfamiliar with the ark and LSP architecture, I'm struggling to carry this out.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can take care of it!

parent: &mut DocumentSymbol,
symbols: &mut Vec<DocumentSymbol>,
) -> Result<bool> {
// if we find an assignment, index it
// Check if the node is a comment and matches the markdown-style comment patterns
if node.node_type() == NodeType::Comment {
let comment_text = contents.node_slice(&node)?.to_string();

// Check if the comment starts with one or more '#' followed by any text and ends with 4+ punctuations
if let Some((_level, title)) = parse_comment_as_section(&comment_text) {
// Create a symbol based on the parsed comment
let start = convert_point_to_position(contents, node.start_position());
let end = convert_point_to_position(contents, node.end_position());

let symbol = DocumentSymbol {
name: title, // Use the title without the trailing '####' or '----'
kind: SymbolKind::STRING, // Treat it as a string section
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed that at the top of this file in symbols() we represent Sections with SymbolKind::MODULE.

I do think SymbolKind::STRING looks a little better, so could you please change MODULE to STRING to keep them in sync?

detail: None, // No need to display level details
children: Some(Vec::new()), // Prepare for child symbols if any
deprecated: None,
tags: None,
range: Range { start, end },
selection_range: Range { start, end },
};

// Add the symbol to the parent node
parent.children.as_mut().unwrap().push(symbol);

// Return early to avoid further processing
return Ok(true);
}
}

if matches!(
node.node_type(),
NodeType::BinaryOperator(BinaryOperatorType::LeftAssignment) |
Expand All @@ -142,7 +183,7 @@ fn index_node(
}
}

// by default, recurse into children
// Recurse into children
let mut cursor = node.walk();
for child in node.children(&mut cursor) {
if is_indexable(&child) {
Expand Down Expand Up @@ -255,3 +296,72 @@ fn index_assignment_with_function(

Ok(true)
}

#[cfg(test)]
mod tests {
use tower_lsp::lsp_types::Position;

use super::*;
use crate::lsp::documents::Document;

fn test_symbol(code: &str) -> Vec<DocumentSymbol> {
let mut symbols: Vec<DocumentSymbol> = Vec::new();

let doc = Document::new(code, None);
let node = doc.ast.root_node();

let start = convert_point_to_position(&doc.contents, node.start_position());
let end = convert_point_to_position(&doc.contents, node.end_position());

let mut root = DocumentSymbol {
name: String::from("<root>"),
kind: SymbolKind::NULL,
children: Some(Vec::new()),
deprecated: None,
tags: None,
detail: None,
range: Range { start, end },
selection_range: Range { start, end },
};

index_node(&node, &doc.contents, &mut root, &mut symbols).unwrap();
root.children.unwrap_or_default()
}

#[test]
fn test_symbol_parse_comment_as_section() {
assert_eq!(parse_comment_as_section("# foo"), None);
assert_eq!(parse_comment_as_section("# foo ---"), None);
assert_eq!(
parse_comment_as_section("# foo ----"),
Some((1, String::from("foo")))
);
}

#[test]
fn test_symbol_comment_sections() {
assert_eq!(test_symbol("# foo"), vec![]);
assert_eq!(test_symbol("# foo ---"), vec![]);

let range = Range {
start: Position {
line: 0,
character: 0,
},
end: Position {
line: 0,
character: 10,
},
};
assert_eq!(test_symbol("# foo ----"), vec![DocumentSymbol {
name: String::from("foo"),
kind: SymbolKind::STRING,
children: Some(Vec::new()),
deprecated: None,
tags: None,
detail: None,
range,
selection_range: range,
}]);
}
}
Loading