Skip to content
This repository has been archived by the owner on Dec 15, 2022. It is now read-only.

Add missing format parsing #234

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open

Add missing format parsing #234

wants to merge 2 commits into from

Conversation

fnadeau
Copy link

@fnadeau fnadeau commented Apr 27, 2017

atom --version
Atom : 1.18.0-dev-dc8311d
Electron: 1.3.15
Chrome : 52.0.2743.82
Node : 6.5.0

Requirements

Highlight of format specifier.

Small test that can be use for regression

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int main ()
{
  // http://man7.org/linux/man-pages/man3/scanf.3.html
  // http://man7.org/linux/man-pages/man3/printf.3.html
  int int1, int2;
  char *str1 = NULL;
  char str2[32];
  // Field argument
  sscanf("10 20", "%2$d %1$d", &int2, &int1);
  printf("%2$*1$d\n", int1, int2); // output: "        20"
  // flags
  printf("%.*d|%#04X|%-5d|% 3d\n",
         5, int1, int2, 0x1EE1, 0xE11E); // output: "00010|0X14|7905 | 57630"
  // Only works if locale uses a char as radix character
  printf("%'.2f\n", 1234567.89);
  //width & precision
  printf( "|%-5d|%5d|\n", 1, 2);              // output: "|1    |    2|"
  printf( "|%1$*2$d|\n", 5, 3);               // output: "|  5|"
  printf("%5.2f\n", 6.28318530718);           // output: " 6.28"
  printf("%.8f\n", 1.61803398875);            // output: "1.61803399"
  printf("%1$*2$.*3$f\n", 1234567.89, 12, 1); // output: "   1234567.9"
  //length modifier
  printf("%1$hhd|%1$hd|%1$d\n", 0x2000100A);  // output: "10|4106|536875018"
  sscanf("discard:value", "%*[a-z]:%ms", &str1);
  printf( "%s\n", str1 );                     // output: "value"
  free(str1);
  str1 = NULL;

  // Bug, should not highligh the 's'
  // %ms is a valid scanf, major rewrite will be needed to fix that corner case
  printf("%ms\n"); // print value of strerror(errno) followed by s: "Successs"

  sscanf("discard]:value", "%*[^]0-9-]]:%s", str2);
  printf( "%s\n", str2 ); // output: "value"

  sscanf("discard]:value", "%m[]drcasi]:%s", &str1, str2);
  printf( "%s|%s\n", str1, str2 ); // output: "discard]|value"
  free(str1);
  str1 = NULL;

  // Should %d be highlighed here? I think not. Major rewrite required.
  puts("this is not a formated place holder %d");

  printf("%%good\n");
  //these compiles with warning but are not good practice
  printf("%#04%\n");
  printf("%wrong\n");

  return 0;
}

Description of the Change

Current implementation does not deal properly with the following:

printf format:

  • %m will print strerror(errno)

for scanf format:

  • optional 'm' can be used with string conversions
  • [ is a valid conversion specifier

literal %:

  • do not match when literal % is used with other flags

Alternate Designs

The parser could be re-written to distinguish between quoted format specifier string and quoted string. It could also distinguish between scanf and printf in order to deal with %m parsing.

The proposed solution was kept for simplicity reason.

Benefits

Proper highlight of valide format specifier

Possible Drawbacks

None

Applicable Issues

The following is not be hilighted properly:
printf("%ms");
the 's' it not part of the conversion specifier in
this case but is when used with scanf.

untitled

for printf format:
 - %m will print strerror(errno)

for scanf format:
 - optional 'm' can be used with string conversions
 - [ is a valid conversion specifier

literal %:
 - do not match when literal % is used with other
   flags

known bug: the following is not be hilighted properly:
printf("%ms");
the 's' it not part of the conversion specifier in
this case but is when used with scanf.
@fnadeau
Copy link
Author

fnadeau commented Jun 29, 2017

Who's doing the review?

grammars/c.cson Outdated
@@ -711,7 +711,8 @@
((-?\\d+)|\\*(-?\\d+\\$)?)? # minimum field width
(\\.((-?\\d+)|\\*(-?\\d+\\$)?)?)? # precision
(hh|h|ll|l|j|t|z|q|L|vh|vl|v|hv|hl)? # length modifier
[diouxXDOUeEfFgGaACcSspn%] # conversion type
((m?\\[((\\^\\]|\\])?[^\\]]+)\\])|m[sc]|[diouxXDOUeEfFgGaACcSspnm]) # conversion type
| %% # literal %
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't this be a single %?

Copy link
Author

@fnadeau fnadeau Jul 11, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, you'll end up with this (line 48):
image

because of the | (or) on line 715, you need two '%'. Previously, the literal '%' was match using two: first on line 707 and second one on line 714.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I missed the |. Thanks for explaining.

@winstliu
Copy link
Contributor

Can you please add some specs that cover these changes?

@sean-mcmanus
Copy link

What's the status of this? Someone requested this at microsoft/vscode-cpptools#1149 , so that '%[' works in fscanf.

@winstliu
Copy link
Contributor

As per my last comment, it still needs specs to be considered.

@fnadeau
Copy link
Author

fnadeau commented Oct 31, 2017

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants