Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add code formatting target to makefile #2760

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

heeplr
Copy link
Contributor

@heeplr heeplr commented Nov 25, 2023

This PR adds 3 new targets to the Makefile:

  • format-python - that uses black to format all python code within src
  • format-cpp - that uses clang-format to format C/C++ files within src
  • format- to combine the above targets

the .clang-format config is based on @rmu75's work with minor changes and lots of disabled options with the goal to

  • invoke as little changes to the source as possible while
  • gaining readability and
  • gaining uniformity

I'd like to suggest the following steps:

  1. test the output of the formatting and adjust the config accordingly.
  2. reformat the whole codebase in one giant commit
  3. wait for dev feedback and adjust the config accordingly, if needed
  4. reformat the whole codebase again, if needed
  5. add some action to the CI that checks/adjusts/enforces formatting in future PRs

@andypugh
Copy link
Collaborator

I am not sure that I have an opinion on this.
Whilst the inconsistency in the file formatting is annoying, a massive re-format would make it practically impossible to do any meaningful code-comparison between the code before the re-format and after.
I would like to hear opinions from actual proper programmers before doing this. (I realise that this commit only makes the major change possible, it doesn't actually do it)

@heeplr
Copy link
Contributor Author

heeplr commented Dec 14, 2023

I'd also be interested in opinions from the community.
Trustwise, i'd suggest that you run the target and do the commit (when there's wide consent on everything).

a massive re-format would make it practically impossible to do any meaningful code-comparison between the code before the re-format and after.

Not impossible but it certainly adds a step. It's a price to pay and I currently don't know any way around it.
Although, I wouldn't say the step is too high and it won't be forever (when at some point really old code isn't needed to be diffed anymore).

For the meantime, there's

  • Tools like astdiff that can deal with reformatting and only show real semantic changes.
  • git diff --ignore-all-space --ignore-blank-lines ... which is useful to clean diffs a bit when no heavy reformatting took place.

@rmu75
Copy link
Contributor

rmu75 commented Dec 14, 2023

In theory, for code comparisions, it should be possible to pipe the old code through clang-format before creating the diff, but I'mt not sure that can be automated with git / github.

@smoe
Copy link
Contributor

smoe commented Dec 30, 2023

I am convinced that source code formatting is important to attract new devs and to make a good first impression on whoever is making decisions to adopt LinuxCNC. So, I would go for it. Concerning diffs, I think we need do talk with GitHub about that.

@petterreinholdtsen
Copy link
Collaborator

petterreinholdtsen commented Jan 2, 2024 via email

@heeplr
Copy link
Contributor Author

heeplr commented Jan 2, 2024

introduce any formatting errors in small steps
do any reformattings as separate commits

What's the advantage here? I can't see any but think, that it makes a nasty process even more complicated.

once any code needing reformating was touched for other reasons.

Isn't that exactly what you want to prevent? It's very easy to confirm a commit that has only reformatting (astdiff must be empty) and you'd want a clean diff without format changes for any commit with semantic changes.

Personally I don't care about diffing very old code with properly formatted newer code. It's a solvable problem and a rare edge case.

needing reformating

I have yet to find a part of the code (.c, .cpp) that is properly formatted. I'm using vscodium and pretty much every file contains wrong/misleading indentation.
For python it's easy since there's pep8 as official formatting standard which lots of our files violate.

@petterreinholdtsen
Copy link
Collaborator

petterreinholdtsen commented Jan 2, 2024 via email

@heeplr
Copy link
Contributor Author

heeplr commented Jan 2, 2024

One advantage is that 'git blame' and 'git log'

Yes, that's basically the problem @andypugh mentioned. It affects all reformatting that isn't white space only.

But I can't see an advantage of many small commits instead of a single big commit. It's just shifting the same problem into the future. Plus it adds logistics like "please separate your reformatting commit and your semantic change commit before submitting a PR" (how to communicate that?). And code review needs to pull out astdiff over and over again until everything is according to standard eventually.
Also style can't be enforced until this process ends eventually, so new PRs can introduce wrong formatting which starts this hellish loop again.
There are parts of the code that haven't been touched in years. They will never be reformatted then.

If the code line change anyway, it does not make much difference to 'git blame' and 'git log' if its indentation alsy change.

But that line will drown in other lines where just the formatting changed if commits are not cleanly separated.

Or are you suggesting to format just parts of a file? omg that would introduce even more mess and can't be automated. Contributors would need to do the work of clang-format manually. I really couldn't see that in practice.

Which code style do you use as your baseline for 'properly formatted'?

Short answer: "properly formatted" is no matter of choosing a style or ruleset but rather anything that's

1. enforcable in CI

and

2. consistent across the codebase.

Longer answer: Currently, the .clang-format config of this PR is based on @rmu75's work with minor changes and lots of disabled options with the goal to

  • invoke as little changes to the source as possible while
  • gaining readability and
  • gaining uniformity

I tried to stick to the clang-format defaults as close as possible with respect to the above aspects.
With black, no config is needed (I just quickly glanced at the resulting python files. I saw no problem.)
Python developers might introduce custom settings/exceptions).

The final ruleset will hopefully be a result of this discussion and there's no trouble changing it in the future and reformatting everything again. But I really don't have an opinion on the final style as long as it's consistent and enforcable.

docs/src/code/style-guide.adoc and src/CodingStyle* are slightly different, so it is a bit unclear to me what the target formatting should be.

Yeah, those would be obsolete (they obviously are already) and might be by replaced by a short note like "run make format before comitting".
If the CI would just re-format every commit, which is possible iirc, the styleguides can be removed completely as IDEs could use the .clang-format or pyproject.toml so contributors don't need to care about formatting at all (who reads styleguides anyway?).

@petterreinholdtsen
Copy link
Collaborator

petterreinholdtsen commented Jan 2, 2024 via email

@heeplr
Copy link
Contributor Author

heeplr commented Jan 2, 2024

I have no idea how attached the project is to those style guides.

Working with the code until now gave me a pretty good idea: No one gives a darn. ;)

@smoe
Copy link
Contributor

smoe commented Jan 3, 2024

IMHO it all depends on what we want to achieve with our code. The doxygen changes (#2827) and the code formatting both help to make LinuxCNC look more professional and likely help new devs to become productive more quickly. I personally also see a certain beauty in an indentation that perfectly and reliably matches the semantics. Anything else to me is like a typo in the documentation, or worse. You just want to fix this.

Git blame and diff can work across multiple revisions, so I am not completely worried. Also, I am more of a bisect person than a diff person. And if truly interested to perform a diff against particular branch, one can still apply the formatting to that branch/revision no matter how old that branch/revision is.

I very much get that anyone deep in the development of LinuxCNC will not necessarily care about the beauty of the code. The community contributing the reformatting and an automation of that process via git hooks should be happily embraced, though.

@andypugh
Copy link
Collaborator

andypugh commented Jan 7, 2024

I haven't found much code where the intentation is wrong, as long as you experiment a bit to see what the original coder wanted to do. The main culprit here are the files that use tabs and spaces and these look wrong until you set tabs to 8 spaces.
I agree that this isn't ideal, but neither is it a huge problem, given how rarely anyone visits those corners of the codebase.

I feel that losing the "blame" view in Github would be a significant sacrifice. It's super-useful for things other that bisect can't do, like checking if changes properly propagated, and if so, which version.

@heeplr
Copy link
Contributor Author

heeplr commented Feb 13, 2024

I haven't found much code where the intentation is wrong

This is not just about indentation. Formatting also includes braces placement, (non)spacing etc.
This is to make all code look the same everywhere. Also automating it will make it future-proof in case anyone comes along, messing up the style even further.

as long as you experiment a bit to see what the original coder wanted to do.

Yeah, I don't really feel like experimenting. And I don't think anyone should be bothered with such a time waster. It's certainly not a good strategy in terms of working on the project.

The main culprit here are the files that use tabs and spaces and these look wrong until you set tabs to 8 spaces.

It's not the "main culprit". It's one flaw among many.

I agree that this isn't ideal, but neither is it a huge problem,

I never said it's a huge problem but it's probably the lowest hanging fruit on a way to a clean & modern codebase.
I just don't see why small problems should stay unsolved when they are fundamential.

given how rarely anyone visits those corners of the codebase.

Understandable. They're not very inviting.

I feel that losing the "blame" view in Github would be a significant sacrifice. It's super-useful for things other that bisect can't do, like checking if changes properly propagated, and if so, which version.

I guess that's what --ignore-rev <rev> was made for. From the git man page:

--ignore-rev <rev>
           Ignore changes made by the revision when assigning blame, as if the change never happened.
           [...]

If you care about the github webinterface, please consider this solution.

Please Remember:
You are worrying about loosing the ability to nicely read the code when doing git blame, which is an edge case.
While this whole thing is just about readability - but for ALL cases. Git blame will also look better after this.

And I really like to stress (again) how backwards the current state is.
Every major FOSS project that's old enough will go through this eventually. Most have already. (Take OpenSSL as just one example. It took them multiple big commits until things had settled.) This is a common milestone and it certainly isn't a huge problem.

IMHO this discussion should focus on "how it's done" not "if it's done". This is just my suggestion for a first draft to tackle the problem. Considering other, much larger issues in the linuxcnc codebase - which require a lot of work and a lot of important decisions - a thing like "automating code formatting (rule enforcment)" should be easy.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants