
1# -*- coding: utf-8 -*- 

2#@+leo-ver=5-thin 

3#@+node:ekr.20141012064706.18389: * @file leoAst.py 

4#@@first 

5# This file is part of Leo: https://leoeditor.com 

6# Leo's copyright notice is based on the MIT license: http://leoeditor.com/license.html 

7 

8# For now, suppress all mypy checks 

9# type: ignore 

10#@+<< docstring >> 

11#@+node:ekr.20200113081838.1: ** << docstring >> (leoAst.py) 

12""" 

13leoAst.py: This file does not depend on Leo in any way. 

14 

15The classes in this file unify python's token-based and ast-based worlds by 

16creating two-way links between tokens in the token list and ast nodes in 

17the parse tree. For more details, see the "Overview" section below. 

18 

19 

20**Stand-alone operation** 

21 

22usage: 

23 leoAst.py --help 

24 leoAst.py [--fstringify | --fstringify-diff | --orange | --orange-diff] PATHS 

25 leoAst.py --py-cov [ARGS] 

26 leoAst.py --pytest [ARGS] 

27 leoAst.py --unittest [ARGS] 

28 

29examples: 

30 --py-cov "-f TestOrange" 

31 --pytest "-f TestOrange" 

32 --unittest TestOrange 

33 

34positional arguments: 

35 PATHS directory or list of files 

36 

37optional arguments: 

38 -h, --help show this help message and exit 

39 --fstringify leonine fstringify 

40 --fstringify-diff show fstringify diff 

41 --orange leonine Black 

42 --orange-diff show orange diff 

43 --py-cov run pytest --cov on leoAst.py 

44 --pytest run pytest on leoAst.py 

45 --unittest run unittest on leoAst.py 

46 

47 

48**Overview** 

49 

50leoAst.py unifies python's token-oriented and ast-oriented worlds. 

51 

52leoAst.py defines classes that create two-way links between tokens 

53created by python's tokenize module and parse tree nodes created by 

54python's ast module: 

55 

56The Token Order Generator (TOG) class quickly creates the following 

57links: 

58 

59- An *ordered* children array from each ast node to its children. 

60 

61- A parent link from each ast.node to its parent. 

62 

63- Two-way links between tokens in the token list, a list of Token 

64 objects, and the ast nodes in the parse tree: 

65 

66 - For each token, token.node contains the ast.node "responsible" for 

67 the token. 

68 

69 - For each ast node, node.first_i and node.last_i are indices into 

70 the token list. These indices give the range of tokens that can be 

71 said to be "generated" by the ast node. 

72 

73Once the TOG class has inserted parent/child links, the Token Order 

74Traverser (TOT) class traverses trees annotated with parent/child 

75links extremely quickly. 
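
For example, here is a sketch (it assumes only this file's own entry point,
TokenOrderGenerator.init_from_file; the filename is a placeholder) showing
how the links can be inspected::

    tog = TokenOrderGenerator()
    contents, encoding, tokens, tree = tog.init_from_file('spam.py')
    for token in tokens:
        if token.node:  # The ast node "responsible" for this token.
            print(token.index, token.kind, token.node.__class__.__name__)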

76 

77 

78**Applicability and importance** 

79 

80 Many python developers will find that asttokens meets all their needs. 

81asttokens is well documented and easy to use. Nevertheless, two-way 

82links are significant additions to python's tokenize and ast modules: 

83 

84- Links from tokens to nodes are assigned to the nearest possible ast 

85 node, not the nearest statement, as in asttokens. Links can easily 

86 be reassigned, if desired. 

87 

88- The TOG and TOT classes are intended to be the foundation of tools 

89 such as fstringify and black. 

90 

91- The TOG class solves real problems, such as: 

92 https://stackoverflow.com/questions/16748029/ 

93 

94**Known bug** 

95 

96This file has no known bugs *except* for Python version 3.8. 

97 

98 For Python 3.8, syncing tokens will fail for function calls such as: 

99 

100 f(1, x=2, *[3, 4], y=5) 

101 

102that is, for calls where keywords appear before non-keyword args. 

103 

104There are no plans to fix this bug. The workaround is to use Python version 

105 3.9 or above. 

106 

107 

108**Figures of merit** 

109 

110Simplicity: The code consists primarily of a set of generators, one 

111for every kind of ast node. 

112 

113Speed: The TOG creates two-way links between tokens and ast nodes in 

114roughly the time taken by python's tokenize.tokenize and ast.parse 

115library methods. This is substantially faster than the asttokens, 

116black or fstringify tools. The TOT class traverses trees annotated 

117with parent/child links even more quickly. 

118 

119Memory: The TOG class makes no significant demands on python's 

120resources. Generators add nothing to python's call stack. 

121TOG.node_stack is the only variable-length data. This stack resides in 

122python's heap, so its length is unimportant. In the worst case, it 

123might contain a few thousand entries. The TOT class uses no 

124variable-length data at all. 

125 

126**Links** 

127 

128Leo... 

129Ask for help: https://groups.google.com/forum/#!forum/leo-editor 

130Report a bug: https://github.com/leo-editor/leo-editor/issues 

131leoAst.py docs: http://leoeditor.com/appendices.html#leoast-py 

132 

133Other tools... 

134asttokens: https://pypi.org/project/asttokens 

135black: https://pypi.org/project/black/ 

136fstringify: https://pypi.org/project/fstringify/ 

137 

138Python modules... 

139tokenize.py: https://docs.python.org/3/library/tokenize.html 

140 ast.py: https://docs.python.org/3/library/ast.html 

141 

142**Studying this file** 

143 

144I strongly recommend that you use Leo when studying this code so that you 

145will see the file's intended outline structure. 

146 

147Without Leo, you will see only special **sentinel comments** that create 

148Leo's outline structure. These comments have the form:: 

149 

150 `#@<comment-kind>:<user-id>.<timestamp>.<number>: <outline-level> <headline>` 

151""" 

152#@-<< docstring >> 

153#@+<< imports >> 

154#@+node:ekr.20200105054219.1: ** << imports >> (leoAst.py) 

155import argparse 

156import ast 

157import codecs 

158import difflib 

159import glob 

160import io 

161import os 

162import re 

163import sys 

164import textwrap 

165import tokenize 

166import traceback 

167from typing import Any, Callable, Dict, Generator, List, Optional, Tuple, Union 

168#@-<< imports >> 

169Node = ast.AST 

170ActionList = List[Tuple[Callable, Any]] 

171v1, v2, junk1, junk2, junk3 = sys.version_info 

172py_version = (v1, v2) 

173 

174# Async tokens exist only in Python 3.5 and 3.6. 

175# https://docs.python.org/3/library/token.html 

176has_async_tokens = (3, 5) <= py_version <= (3, 6) 

177 

178# has_position_only_params = (v1, v2) >= (3, 8) 

179#@+others 

180#@+node:ekr.20191226175251.1: ** class LeoGlobals 

181#@@nosearch 

182 

183 

184class LeoGlobals: # pragma: no cover 

185 """ 

186 Simplified version of functions in leoGlobals.py. 

187 """ 

188 

189 total_time = 0.0 # For unit testing. 

190 

191 #@+others 

192 #@+node:ekr.20191226175903.1: *3* LeoGlobals.callerName 

193 def callerName(self, n: int) -> str: 

194 """Get the function name from the call stack.""" 

195 try: 

196 f1 = sys._getframe(n) 

197 code1 = f1.f_code 

198 return code1.co_name 

199 except Exception: 

200 return '' 

201 #@+node:ekr.20191226175426.1: *3* LeoGlobals.callers 

202 def callers(self, n: int=4) -> str: 

203 """ 

204 Return a string containing a comma-separated list of the callers 

205 of the function that called g.callers. 

206 """ 

207 i, result = 2, [] 

208 while True: 

209 s = self.callerName(n=i) 

210 if s: 

211 result.append(s) 

212 if not s or len(result) >= n: 

213 break 

214 i += 1 

215 return ','.join(reversed(result)) 

216 #@+node:ekr.20191226190709.1: *3* leoGlobals.es_exception & helper 

217 def es_exception(self, full: bool=True) -> Tuple[str, int]: 

218 typ, val, tb = sys.exc_info() 

219 for line in traceback.format_exception(typ, val, tb): 

220 print(line) 

221 fileName, n = self.getLastTracebackFileAndLineNumber() 

222 return fileName, n 

223 #@+node:ekr.20191226192030.1: *4* LeoGlobals.getLastTracebackFileAndLineNumber 

224 def getLastTracebackFileAndLineNumber(self) -> Tuple[str, int]: 

225 typ, val, tb = sys.exc_info() 

226 if typ == SyntaxError: 

227 # IndentationError is a subclass of SyntaxError. 

228 # SyntaxError *does* have 'filename' and 'lineno' attributes. 

229 return val.filename, val.lineno 

230 # 

231 # Data is a list of tuples, one per stack entry. 

232 # The tuples have the form (filename, lineNumber, functionName, text). 

233 data = traceback.extract_tb(tb) 

234 item = data[-1] # Get the item at the top of the stack. 

235 filename, n, functionName, text = item 

236 return filename, n 

237 #@+node:ekr.20200220065737.1: *3* LeoGlobals.objToString 

238 def objToString(self, obj: Any, tag: str=None) -> str: 

239 """Simplified version of g.printObj.""" 

240 result = [] 

241 if tag: 

242 result.append(f"{tag}...") 

243 if isinstance(obj, str): 

244 obj = g.splitLines(obj) 

245 if isinstance(obj, list): 

246 result.append('[') 

247 for z in obj: 

248 result.append(f" {z!r}") 

249 result.append(']') 

250 elif isinstance(obj, tuple): 

251 result.append('(') 

252 for z in obj: 

253 result.append(f" {z!r}") 

254 result.append(')') 

255 else: 

256 result.append(repr(obj)) 

257 result.append('') 

258 return '\n'.join(result) 

259 #@+node:ekr.20220327132500.1: *3* LeoGlobals.pdb 

260 def pdb(self) -> None: 

261 import pdb as _pdb 

262 # pylint: disable=forgotten-debug-statement 

263 _pdb.set_trace() 

264 #@+node:ekr.20191226190425.1: *3* LeoGlobals.plural 

265 def plural(self, obj: Any) -> str: 

266 """Return "s" or "" depending on n.""" 

267 if isinstance(obj, (list, tuple, str)): 

268 n = len(obj) 

269 else: 

270 n = obj 

271 return '' if n == 1 else 's' 

272 #@+node:ekr.20191226175441.1: *3* LeoGlobals.printObj 

273 def printObj(self, obj: Any, tag: str=None) -> None: 

274 """Simplified version of g.printObj.""" 

275 print(self.objToString(obj, tag)) 

276 #@+node:ekr.20220327120618.1: *3* LeoGlobals.shortFileName 

277 def shortFileName(self, fileName: str) -> str: 

278 """Return the base name of a path.""" 

279 return os.path.basename(fileName) if fileName else '' 

280 #@+node:ekr.20191226190131.1: *3* LeoGlobals.splitLines 

281 def splitLines(self, s: str) -> List[str]: 

282 """Split s into lines, preserving the number of lines and 

283 the endings of all lines, including the last line.""" 

284 # g.stat() 

285 if s: 

286 return s.splitlines(True) # This is a Python string function! 

287 return [] 

288 #@+node:ekr.20191226190844.1: *3* LeoGlobals.toEncodedString 

289 def toEncodedString(self, s: Any, encoding: str='utf-8') -> bytes: 

290 """Convert unicode string to an encoded string.""" 

291 if not isinstance(s, str): 

292 return s 

293 try: 

294 s = s.encode(encoding, "strict") 

295 except UnicodeError: 

296 s = s.encode(encoding, "replace") 

297 print(f"toEncodedString: Error converting {s!r} to {encoding}") 

298 return s 

299 #@+node:ekr.20191226190006.1: *3* LeoGlobals.toUnicode 

300 def toUnicode(self, s: Any, encoding: str='utf-8') -> str: 

301 """Convert bytes to unicode if necessary.""" 

302 tag = 'g.toUnicode' 

303 if isinstance(s, str): 

304 return s 

305 if not isinstance(s, bytes): 

306 print(f"{tag}: bad s: {s!r}") 

307 return '' 

308 b: bytes = s 
s2 = ''  # Default, so the generic error handler below can report safely.

309 try: 

310 s2 = b.decode(encoding, 'strict') 

311 except (UnicodeDecodeError, UnicodeError): 

312 s2 = b.decode(encoding, 'replace') 

313 print(f"{tag}: unicode error. encoding: {encoding!r}, s2:\n{s2!r}") 

314 g.trace(g.callers()) 

315 except Exception: 

316 g.es_exception() 

317 print(f"{tag}: unexpected error! encoding: {encoding!r}, s2:\n{s2!r}") 

318 g.trace(g.callers()) 

319 return s2 

320 #@+node:ekr.20191226175436.1: *3* LeoGlobals.trace 

321 def trace(self, *args: Any) -> None: 

322 """Print a tracing message.""" 

323 # Compute the caller name. 

324 try: 

325 f1 = sys._getframe(1) 

326 code1 = f1.f_code 

327 name = code1.co_name 

328 except Exception: 

329 name = '' 

330 print(f"{name}: {' '.join(str(z) for z in args)}") 

331 #@+node:ekr.20191226190241.1: *3* LeoGlobals.truncate 

332 def truncate(self, s: str, n: int) -> str: 

333 """Return s truncated to n characters.""" 

334 if len(s) <= n: 

335 return s 

336 s2 = s[: n - 3] + f"...({len(s)})" 

337 return s2 + '\n' if s.endswith('\n') else s2 

338 #@-others 

339#@+node:ekr.20200702114522.1: ** leoAst.py: top-level commands 

340#@+node:ekr.20200702114557.1: *3* command: fstringify_command 

341def fstringify_command(files: List[str]) -> None: 

342 """ 

343 Entry point for --fstringify. 

344 

345 Fstringify the given file, overwriting the file. 

346 """ 

347 for filename in files: # pragma: no cover 

348 if os.path.exists(filename): 

349 print(f"fstringify {filename}") 

350 Fstringify().fstringify_file_silent(filename) 

351 else: 

352 print(f"file not found: {filename}") 

353#@+node:ekr.20200702121222.1: *3* command: fstringify_diff_command 

354def fstringify_diff_command(files: List[str]) -> None: 

355 """ 

356 Entry point for --fstringify-diff. 

357 

358 Print the diff that would be produced by fstringify. 

359 """ 

360 for filename in files: # pragma: no cover 

361 if os.path.exists(filename): 

362 print(f"fstringify-diff {filename}") 

363 Fstringify().fstringify_file_diff(filename) 

364 else: 

365 print(f"file not found: {filename}") 

366#@+node:ekr.20200702115002.1: *3* command: orange_command 

367def orange_command(files: List[str], settings: Optional[Dict[str, Any]]=None) -> None: 

368 

369 for filename in files: # pragma: no cover 

370 if os.path.exists(filename): 

371 # print(f"orange {filename}") 

372 Orange(settings).beautify_file(filename) 

373 else: 

374 print(f"file not found: {filename}") 

375 print(f"Beautify done: {len(files)} files") 

376#@+node:ekr.20200702121315.1: *3* command: orange_diff_command 

377def orange_diff_command(files: List[str], settings: Optional[Dict[str, Any]]=None) -> None: 

378 

379 for filename in files: # pragma: no cover 

380 if os.path.exists(filename): 

381 print(f"orange-diff {filename}") 

382 Orange(settings).beautify_file_diff(filename) 

383 else: 

384 print(f"file not found: {filename}") 

385#@+node:ekr.20160521104628.1: ** leoAst.py: top-level utils 

386if 1: # pragma: no cover 

387 #@+others 

388 #@+node:ekr.20200702102239.1: *3* function: main (leoAst.py) 

389 def main() -> None: 

390 """Run commands specified by sys.argv.""" 

391 args, settings_dict, arg_files, recursive = scan_ast_args() 

392 # Finalize arguments. 

393 cwd, files = os.getcwd(), [] 

394 for path in arg_files: 

395 root_dir = os.path.join(cwd, path) 

396 files.extend(glob.glob(f'{root_dir}**{os.sep}*.py', recursive=recursive))  # Accumulate files across all PATHS. 

397 if not files: 

398 print('No files found') 

399 return 

400 # Execute the command. 

401 print(f"Found {len(files)} file{g.plural(len(files))}.") 

402 if args.f: 

403 fstringify_command(files) 

404 if args.fd: 

405 fstringify_diff_command(files) 

406 if args.o: 

407 orange_command(files, settings_dict) 

408 if args.od: 

409 orange_diff_command(files, settings_dict) 

410 #@+node:ekr.20220404062739.1: *3* function: scan_ast_args 

411 def scan_ast_args() -> Tuple[Any, Dict[str, Any], List[str], bool]: 

412 description = textwrap.dedent("""\ 

413 Execute fstringify or beautify commands contained in leoAst.py. 

414 """) 

415 parser = argparse.ArgumentParser( 

416 description=description, 

417 formatter_class=argparse.RawTextHelpFormatter) 

418 parser.add_argument('PATHS', nargs='*', help='directory or list of files') 

419 group = parser.add_mutually_exclusive_group(required=False) # Don't require any args. 

420 add = group.add_argument 

421 add('--fstringify', dest='f', action='store_true', 

422 help='fstringify PATHS') 

423 add('--fstringify-diff', dest='fd', action='store_true', 

424 help='fstringify diff PATHS') 

425 add('--orange', dest='o', action='store_true', 

426 help='beautify PATHS') 

427 add('--orange-diff', dest='od', action='store_true', 

428 help='diff beautify PATHS') 

429 # New arguments. 

430 add2 = parser.add_argument 

431 add2('--allow-joined', dest='allow_joined', action='store_true', 

432 help='allow joined strings') 

433 add2('--max-join', dest='max_join', metavar='N', type=int, 

434 help='max unsplit line length (default 0)') 

435 add2('--max-split', dest='max_split', metavar='N', type=int, 

436 help='max unjoined line length (default 0)') 

437 add2('--recursive', dest='recursive', action='store_true', 

438 help='include directories recursively') 

439 add2('--tab-width', dest='tab_width', metavar='N', type=int, 

440 help='tab-width (default -4)') 

441 # Create the return values, using EKR's prefs as the defaults. 

442 parser.set_defaults( 

443 allow_joined=False, 

444 max_join=0, 

445 max_split=0, 

446 recursive=False, 

447 tab_width=4, 

448 ) 

449 args = parser.parse_args() 

450 files = args.PATHS 

451 recursive = args.recursive 

452 # Create the settings dict, ensuring proper values. 

453 settings_dict: Dict[str, Any] = { 

454 'allow_joined_strings': bool(args.allow_joined), 

455 'max_join_line_length': abs(args.max_join), 

456 'max_split_line_length': abs(args.max_split), 

457 'tab_width': abs(args.tab_width), # Must be positive! 

458 } 

459 return args, settings_dict, files, recursive 

460 #@+node:ekr.20200107114409.1: *3* functions: reading & writing files 

461 #@+node:ekr.20200218071822.1: *4* function: regularize_nls 

462 def regularize_nls(s: str) -> str: 

463 """Regularize newlines within s.""" 

464 return s.replace('\r\n', '\n').replace('\r', '\n') 
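# Example: regularize_nls('a\r\nb\rc\n') == 'a\nb\nc\n'.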

465 #@+node:ekr.20200106171502.1: *4* function: get_encoding_directive 

466 # This is the pattern in PEP 263. 

467 encoding_pattern = re.compile(r'^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)') 

468 

469 def get_encoding_directive(bb: bytes) -> str: 

470 """ 

471 Get the encoding from the encoding directive at the start of a file. 

472 

473 bb: The bytes of the file. 

474 

475 Returns the codec name, or 'UTF-8'. 

476 

477 Adapted from pyzo. Copyright 2008 to 2020 by Almar Klein. 

478 """ 

479 for line in bb.split(b'\n', 2)[:2]: 

480 # Try to make line a string 

481 try: 

482 line2 = line.decode('ASCII').strip() 

483 except Exception: 

484 continue 

485 # Does the line match the PEP 263 pattern? 

486 m = encoding_pattern.match(line2) 

487 if not m: 

488 continue 

489 # Is it a known encoding? Correct the name if it is. 

490 try: 

491 c = codecs.lookup(m.group(1)) 

492 return c.name 

493 except Exception: 

494 pass 

495 return 'UTF-8' 
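# A hedged sketch of the directive lookup above: codecs.lookup normalizes
# the name, so 'latin-1' comes back as 'iso8859-1'. Never executed.
if 0:  # pragma: no cover
    assert get_encoding_directive(b'# -*- coding: latin-1 -*-\n') == 'iso8859-1'
    assert get_encoding_directive(b'print(1)\n') == 'UTF-8'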

496 #@+node:ekr.20200103113417.1: *4* function: read_file 

497 def read_file(filename: str, encoding: str='utf-8') -> Optional[str]: 

498 """ 

499 Return the contents of the file with the given name. 

500 Print an error message and return None on error. 

501 """ 

502 tag = 'read_file' 

503 try: 

504 # Translate all newlines to '\n'. 

505 with open(filename, 'r', encoding=encoding) as f: 

506 s = f.read() 

507 return regularize_nls(s) 

508 except Exception: 

509 print(f"{tag}: can not read {filename}") 

510 return None 

511 #@+node:ekr.20200106173430.1: *4* function: read_file_with_encoding 

512 def read_file_with_encoding(filename: str) -> Tuple[str, str]: 

513 """ 

514 Read the file with the given name, returning (e, s), where: 

515 

516 s is the string, converted to unicode, or '' if there was an error. 

517 

518 e is the encoding of s, computed in the following order: 

519 

520 - The BOM encoding if the file starts with a BOM mark. 

521 - The encoding given in the # -*- coding: utf-8 -*- line. 

522 - The encoding given by the 'encoding' keyword arg. 

523 - 'utf-8'. 

524 """ 

525 # First, read the file. 

526 tag = 'read_with_encoding' 

527 try: 

528 with open(filename, 'rb') as f: 

529 bb = f.read() 

530 except Exception: 

531 print(f"{tag}: can not read {filename}") 

532 if not bb: 

533 return 'UTF-8', '' 

534 # Look for the BOM. 

535 e, bb = strip_BOM(bb) 

536 if not e: 

537 # Python's encoding comments override everything else. 

538 e = get_encoding_directive(bb) 

539 s = g.toUnicode(bb, encoding=e) 

540 s = regularize_nls(s) 

541 return e, s 

542 #@+node:ekr.20200106174158.1: *4* function: strip_BOM 

543 def strip_BOM(bb: bytes) -> Tuple[Optional[str], bytes]: 

544 """ 

545 bb must be the bytes contents of a file. 

546 

547 If bb starts with a BOM (Byte Order Mark), return (e, bb2), where: 

548 

549 - e is the encoding implied by the BOM. 

550 - bb2 is bb, stripped of the BOM. 

551 

552 If there is no BOM, return (None, bb) 

553 """ 

554 assert isinstance(bb, bytes), bb.__class__.__name__ 

555 table = ( 

556 # Test longer BOMs first. 

557 (4, 'utf-32', codecs.BOM_UTF32_BE), 

558 (4, 'utf-32', codecs.BOM_UTF32_LE), 

559 (3, 'utf-8', codecs.BOM_UTF8), 

560 (2, 'utf-16', codecs.BOM_UTF16_BE), 

561 (2, 'utf-16', codecs.BOM_UTF16_LE), 

562 ) 

563 for n, e, bom in table: 

564 assert len(bom) == n 

565 if bom == bb[: len(bom)]: 

566 return e, bb[len(bom) :] 

567 return None, bb 
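# A hedged sketch of strip_BOM: a UTF-8 BOM yields ('utf-8', stripped bytes);
# no BOM yields (None, original bytes). Never executed.
if 0:  # pragma: no cover
    e, bb2 = strip_BOM(codecs.BOM_UTF8 + b'x = 1\n')
    assert e == 'utf-8' and bb2 == b'x = 1\n'
    assert strip_BOM(b'x = 1\n') == (None, b'x = 1\n')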

568 #@+node:ekr.20200103163100.1: *4* function: write_file 

569 def write_file(filename: str, s: str, encoding: str='utf-8') -> None: 

570 """ 

571 Write the string s to the file whose name is given. 

572 

573 Handle all exceptions. 

574 

575 Before calling this function, the caller should ensure 

576 that the file actually has been changed. 

577 """ 

578 try: 

579 # Write the file with platform-dependent newlines. 

580 with open(filename, 'w', encoding=encoding) as f: 

581 f.write(s) 

582 except Exception as e: 

583 g.trace(f"Error writing {filename}\n{e}") 

584 #@+node:ekr.20200113154120.1: *3* functions: tokens 

585 #@+node:ekr.20191223093539.1: *4* function: find_anchor_token 

586 def find_anchor_token(node: Node, global_token_list: List["Token"]) -> Optional["Token"]: 

587 """ 

588 Return the anchor_token for node, a token such that token.node == node. 

589 

590 The search starts at node, and then all the usual child nodes. 

591 """ 

592 

593 node1 = node 

594 

595 def anchor_token(node: Node) -> Optional["Token"]: 

596 """Return the anchor token in node.token_list""" 

597 # Careful: some tokens in the token list may have been killed. 

598 for token in get_node_token_list(node, global_token_list): 

599 if is_ancestor(node1, token): 

600 return token 

601 return None 

602 

603 # This table only has to cover fields for ast.Nodes that 

604 # won't have any associated token. 

605 

606 fields = ( 

607 # Common... 

608 'elt', 'elts', 'body', 'value', # Less common... 

609 'dims', 'ifs', 'names', 's', 

610 'test', 'values', 'targets', 

611 ) 

612 while node: 

613 # First, try the node itself. 

614 token = anchor_token(node) 

615 if token: 

616 return token 

617 # Second, try the most common nodes w/o token_lists: 

618 if isinstance(node, ast.Call): 

619 node = node.func 

620 elif isinstance(node, ast.Tuple): 

621 node = node.elts # type:ignore 

622 # Finally, try all other nodes. 

623 else: 

624 # This will be used rarely. 

625 for field in fields: 

626 node = getattr(node, field, None) 

627 if node: 

628 token = anchor_token(node) 

629 if token: 

630 return token 

631 else: 

632 break 

633 return None 

634 #@+node:ekr.20191231160225.1: *4* function: find_paren_token (changed signature) 

635 def find_paren_token(i: int, global_token_list: List["Token"]) -> Optional[int]: 

636 """Return i of the next paren token, starting at tokens[i].""" 

637 while i < len(global_token_list): 

638 token = global_token_list[i] 

639 if token.kind == 'op' and token.value in '()': 

640 return i 

641 if is_significant_token(token): 

642 break 

643 i += 1 

644 return None 

645 #@+node:ekr.20200113110505.4: *4* function: get_node_tokens_list 

646 def get_node_token_list(node: Node, global_tokens_list: List["Token"]) -> List["Token"]: 

647 """ 

648 tokens_list must be the global tokens list. 

649 Return the tokens assigned to the node, or []. 

650 """ 

651 i = getattr(node, 'first_i', None) 

652 j = getattr(node, 'last_i', None) 

653 return [] if i is None else global_tokens_list[i : j + 1] 

654 #@+node:ekr.20191124123830.1: *4* function: is_significant & is_significant_token 

655 def is_significant(kind: str, value: str) -> bool: 

656 """ 

657 Return True if (kind, value) represent a token that can be used for 

658 syncing generated tokens with the token list. 

659 """ 

660 # Making 'endmarker' significant ensures that all tokens are synced. 

661 return ( 

662 kind in ('async', 'await', 'endmarker', 'name', 'number', 'string') or 

663 kind == 'op' and value not in ',;()') 

664 

665 def is_significant_token(token: "Token") -> bool: 

666 """Return True if the given token is a syncronizing token""" 

667 return is_significant(token.kind, token.value) 
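# A hedged sketch of the rule above: names synchronize the two token
# streams; commas, semicolons and parens do not. Never executed.
if 0:  # pragma: no cover
    assert is_significant('name', 'x')
    assert is_significant('op', '+')
    assert not is_significant('op', ',')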

668 #@+node:ekr.20191224093336.1: *4* function: match_parens 

669 def match_parens(filename: str, i: int, j: int, tokens: List["Token"]) -> int: 

670 """Match parens in tokens[i:j]. Return the new j.""" 

671 if j >= len(tokens): 

672 return len(tokens) 

673 # Calculate paren level... 

674 level = 0 

675 for n in range(i, j + 1): 

676 token = tokens[n] 

677 if token.kind == 'op' and token.value == '(': 

678 level += 1 

679 if token.kind == 'op' and token.value == ')': 

680 if level == 0: 

681 break 

682 level -= 1 

683 # Find matching ')' tokens *after* j. 

684 if level > 0: 

685 while level > 0 and j + 1 < len(tokens): 

686 token = tokens[j + 1] 

687 if token.kind == 'op' and token.value == ')': 

688 level -= 1 

689 elif token.kind == 'op' and token.value == '(': 

690 level += 1 

691 elif is_significant_token(token): 

692 break 

693 j += 1 

694 if level != 0: # pragma: no cover. 

695 line_n = tokens[i].line_number 

696 raise AssignLinksError( 

697 f"\n" 

698 f"Unmatched parens: level={level}\n" 

699 f" file: {filename}\n" 

700 f" line: {line_n}\n") 

701 return j 

702 #@+node:ekr.20191223053324.1: *4* function: tokens_for_node 

703 def tokens_for_node(filename: str, node: Node, global_token_list: List["Token"]) -> List["Token"]: 

704 """Return the list of all tokens descending from node.""" 

705 # Find any token descending from node. 

706 token = find_anchor_token(node, global_token_list) 

707 if not token: 

708 if 0: # A good trace for debugging. 

709 print('') 

710 g.trace('===== no tokens', node.__class__.__name__) 

711 return [] 

712 assert is_ancestor(node, token) 

713 # Scan backward. 

714 i = first_i = token.index 

715 while i > 0:  # Don't let i - 1 wrap around to the last token. 

716 token2 = global_token_list[i - 1] 

717 if getattr(token2, 'node', None): 

718 if is_ancestor(node, token2): 

719 first_i = i - 1 

720 else: 

721 break 

722 i -= 1 

723 # Scan forward. 

724 j = last_j = token.index 

725 while j + 1 < len(global_token_list): 

726 token2 = global_token_list[j + 1] 

727 if getattr(token2, 'node', None): 

728 if is_ancestor(node, token2): 

729 last_j = j + 1 

730 else: 

731 break 

732 j += 1 

733 last_j = match_parens(filename, first_i, last_j, global_token_list) 

734 results = global_token_list[first_i : last_j + 1] 

735 return results 

736 #@+node:ekr.20200101030236.1: *4* function: tokens_to_string 

737 def tokens_to_string(tokens: List[Any]) -> str: 

738 """Return the string represented by the list of tokens.""" 

739 if tokens is None: 

740 # This indicates an internal error. 

741 print('') 

742 g.trace('===== token list is None ===== ') 

743 print('') 

744 return '' 

745 return ''.join([z.to_string() for z in tokens]) 

746 #@+node:ekr.20191223095408.1: *3* node/token nodes... 

747 # Functions that associate tokens with nodes. 

748 #@+node:ekr.20200120082031.1: *4* function: find_statement_node 

749 def find_statement_node(node: Node) -> Optional[Node]: 

750 """ 

751 Return the nearest statement node. 

752 Return None if node has only Module for a parent. 

753 """ 

754 if isinstance(node, ast.Module): 

755 return None 

756 parent = node 

757 while parent: 

758 if is_statement_node(parent): 

759 return parent 

760 parent = parent.parent 

761 return None 

762 #@+node:ekr.20191223054300.1: *4* function: is_ancestor 

763 def is_ancestor(node: Node, token: "Token") -> bool: 

764 """Return True if node is an ancestor of token.""" 

765 t_node = token.node 

766 if not t_node: 

767 assert token.kind == 'killed', repr(token) 

768 return False 

769 while t_node: 

770 if t_node == node: 

771 return True 

772 t_node = t_node.parent 

773 return False 

774 #@+node:ekr.20200120082300.1: *4* function: is_long_statement 

775 def is_long_statement(node: Node) -> bool: 

776 """ 

777 Return True if node is an instance of a node that might be split into 

778 shorter lines. 

779 """ 

780 return isinstance(node, ( 

781 ast.Assign, ast.AnnAssign, ast.AsyncFor, ast.AsyncWith, ast.AugAssign, 

782 ast.Call, ast.Delete, ast.ExceptHandler, ast.For, ast.Global, 

783 ast.If, ast.Import, ast.ImportFrom, 

784 ast.Nonlocal, ast.Return, ast.While, ast.With, ast.Yield, ast.YieldFrom)) 

785 #@+node:ekr.20200120110005.1: *4* function: is_statement_node 

786 def is_statement_node(node: Node) -> bool: 

787 """Return True if node is a top-level statement.""" 

788 return is_long_statement(node) or isinstance(node, ( 

789 ast.Break, ast.Continue, ast.Pass, ast.Try)) 

790 #@+node:ekr.20191231082137.1: *4* function: nearest_common_ancestor 

791 def nearest_common_ancestor(node1: Node, node2: Node) -> Optional[Node]: 

792 """ 

793 Return the nearest common ancestor node for the given nodes. 

794 

795 The nodes must have parent links. 

796 """ 

797 

798 def parents(node: Node) -> List[Node]: 

799 aList = [] 

800 while node: 

801 aList.append(node) 

802 node = node.parent 

803 return list(reversed(aList)) 

804 

805 result = None 

806 parents1 = parents(node1) 

807 parents2 = parents(node2) 

808 while parents1 and parents2: 

809 parent1 = parents1.pop(0) 

810 parent2 = parents2.pop(0) 

811 if parent1 == parent2: 

812 result = parent1 

813 else: 

814 break 

815 return result 

816 #@+node:ekr.20191231072039.1: *3* functions: utils... 

817 # General utility functions on tokens and nodes. 

818 #@+node:ekr.20191119085222.1: *4* function: obj_id 

819 def obj_id(obj: Any) -> str: 

820 """Return the last four digits of id(obj), for dumps & traces.""" 

821 return str(id(obj))[-4:] 

822 #@+node:ekr.20191231060700.1: *4* function: op_name 

823 #@@nobeautify 

824 

825 # https://docs.python.org/3/library/ast.html 

826 

827 _op_names = { 

828 # Binary operators. 

829 'Add': '+', 

830 'BitAnd': '&', 

831 'BitOr': '|', 

832 'BitXor': '^', 

833 'Div': '/', 

834 'FloorDiv': '//', 

835 'LShift': '<<', 

836 'MatMult': '@', # Python 3.5. 

837 'Mod': '%', 

838 'Mult': '*', 

839 'Pow': '**', 

840 'RShift': '>>', 

841 'Sub': '-', 

842 # Boolean operators. 

843 'And': ' and ', 

844 'Or': ' or ', 

845 # Comparison operators 

846 'Eq': '==', 

847 'Gt': '>', 

848 'GtE': '>=', 

849 'In': ' in ', 

850 'Is': ' is ', 

851 'IsNot': ' is not ', 

852 'Lt': '<', 

853 'LtE': '<=', 

854 'NotEq': '!=', 

855 'NotIn': ' not in ', 

856 # Context operators. 

857 'AugLoad': '<AugLoad>', 

858 'AugStore': '<AugStore>', 

859 'Del': '<Del>', 

860 'Load': '<Load>', 

861 'Param': '<Param>', 

862 'Store': '<Store>', 

863 # Unary operators. 

864 'Invert': '~', 

865 'Not': ' not ', 

866 'UAdd': '+', 

867 'USub': '-', 

868 } 

869 

870 def op_name(node: Node) -> str: 

871 """Return the print name of an operator node.""" 

872 class_name = node.__class__.__name__ 

873 assert class_name in _op_names, repr(class_name) 

874 return _op_names[class_name].strip() 
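# A hedged sketch: op_name returns the stripped spelling from _op_names.
# Never executed.
if 0:  # pragma: no cover
    assert op_name(ast.Add()) == '+'
    assert op_name(ast.NotIn()) == 'not in'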

875 #@+node:ekr.20200107114452.1: *3* node/token creators... 

876 #@+node:ekr.20200103082049.1: *4* function: make_tokens 

877 def make_tokens(contents: str) -> List["Token"]: 

878 """ 

879 Return a list (not a generator) of Token objects corresponding to the 

880 list of 5-tuples generated by tokenize.tokenize. 

881 

882 Perform consistency checks and handle all exceptions. 

883 """ 

884 

885 def check(contents: str, tokens: List["Token"]) -> bool: 

886 result = tokens_to_string(tokens) 

887 ok = result == contents 

888 if not ok: 

889 print('\nRound-trip check FAILS') 

890 print('Contents...\n') 

891 g.printObj(contents) 

892 print('\nResult...\n') 

893 g.printObj(result) 

894 return ok 

895 

896 try: 

897 five_tuples = tokenize.tokenize( 

898 io.BytesIO(contents.encode('utf-8')).readline) 

899 except Exception: 

900 print('make_tokens: exception in tokenize.tokenize') 

901 g.es_exception() 

902 return None 

903 tokens = Tokenizer().create_input_tokens(contents, five_tuples) 

904 assert check(contents, tokens) 

905 return tokens 
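# A hedged sketch: make_tokens round-trips exactly (its check() helper
# asserts this), so tokens_to_string inverts it. Never executed.
if 0:  # pragma: no cover
    s = 'x = 1  # comment\n'
    assert tokens_to_string(make_tokens(s)) == s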

906 #@+node:ekr.20191027075648.1: *4* function: parse_ast 

907 def parse_ast(s: str) -> Optional[Node]: 

908 """ 

909 Parse string s, catching & reporting all exceptions. 

910 Return the ast node, or None. 

911 """ 

912 

913 def oops(message: str) -> None: 

914 print('') 

915 print(f"parse_ast: {message}") 

916 g.printObj(s) 

917 print('') 

918 

919 try: 

920 s1 = g.toEncodedString(s) 

921 tree = ast.parse(s1, filename='before', mode='exec') 

922 return tree 

923 except IndentationError: 

924 oops('Indentation Error') 

925 except SyntaxError: 

926 oops('Syntax Error') 

927 except Exception: 

928 oops('Unexpected Exception') 

929 g.es_exception() 

930 return None 
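# A hedged sketch: parse_ast reports errors and returns None instead of
# raising. Never executed.
if 0:  # pragma: no cover
    assert isinstance(parse_ast('x = 1\n'), ast.Module)
    assert parse_ast('x =') is None  # SyntaxError: reported, not raised.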

931 #@+node:ekr.20191231110051.1: *3* node/token dumpers... 

932 #@+node:ekr.20191027074436.1: *4* function: dump_ast 

933 def dump_ast(ast: Node, tag: str='dump_ast') -> None: 

934 """Utility to dump an ast tree.""" 

935 g.printObj(AstDumper().dump_ast(ast), tag=tag) 

936 #@+node:ekr.20191228095945.4: *4* function: dump_contents 

937 def dump_contents(contents: str, tag: str='Contents') -> None: 

938 print('') 

939 print(f"{tag}...\n") 

940 for i, z in enumerate(g.splitLines(contents)): 

941 print(f"{i+1:<3} ", z.rstrip()) 

942 print('') 

943 #@+node:ekr.20191228095945.5: *4* function: dump_lines 

944 def dump_lines(tokens: List["Token"], tag: str='Token lines') -> None: 

945 print('') 

946 print(f"{tag}...\n") 

947 for z in tokens: 

948 if z.line.strip(): 

949 print(z.line.rstrip()) 

950 else: 

951 print(repr(z.line)) 

952 print('') 

953 #@+node:ekr.20191228095945.7: *4* function: dump_results 

954 def dump_results(tokens: List["Token"], tag: str='Results') -> None: 

955 print('') 

956 print(f"{tag}...\n") 

957 print(tokens_to_string(tokens)) 

958 print('') 

959 #@+node:ekr.20191228095945.8: *4* function: dump_tokens 

960 def dump_tokens(tokens: List["Token"], tag: str='Tokens') -> None: 

961 print('') 

962 print(f"{tag}...\n") 

963 if not tokens: 

964 return 

965 print("Note: values shown are repr(value) *except* for 'string' tokens.") 

966 tokens[0].dump_header() 

967 for i, z in enumerate(tokens): 

968 # Confusing. 

969 # if (i % 20) == 0: z.dump_header() 

970 print(z.dump()) 

971 print('') 

972 #@+node:ekr.20191228095945.9: *4* function: dump_tree 

973 def dump_tree(tokens: List["Token"], tree: Node, tag: str='Tree') -> None: 

974 print('') 

975 print(f"{tag}...\n") 

976 print(AstDumper().dump_tree(tokens, tree)) 

977 #@+node:ekr.20200107040729.1: *4* function: show_diffs 

978 def show_diffs(s1: str, s2: str, filename: str='') -> None: 

979 """Print diffs between strings s1 and s2.""" 

980 lines = list(difflib.unified_diff( 

981 g.splitLines(s1), 

982 g.splitLines(s2), 

983 fromfile=f"Old {filename}", 

984 tofile=f"New {filename}", 

985 )) 

986 print('') 

987 tag = f"Diffs for {filename}" if filename else 'Diffs' 

988 g.printObj(lines, tag=tag) 

989 #@+node:ekr.20191225061516.1: *3* node/token replacers... 

990 # Functions that replace tokens or nodes. 

991 #@+node:ekr.20191231162249.1: *4* function: add_token_to_token_list 

992 def add_token_to_token_list(token: "Token", node: Node) -> None: 

993 """Insert token in the proper location of node.token_list.""" 

994 if getattr(node, 'first_i', None) is None: 

995 node.first_i = node.last_i = token.index 

996 else: 

997 node.first_i = min(node.first_i, token.index) 

998 node.last_i = max(node.last_i, token.index) 

999 #@+node:ekr.20191225055616.1: *4* function: replace_node 

1000 def replace_node(new_node: Node, old_node: Node) -> None: 

1001 """Replace new_node by old_node in the parse tree.""" 

1002 parent = old_node.parent 

1003 new_node.parent = parent 

1004 new_node.node_index = old_node.node_index 

1005 children = parent.children 

1006 i = children.index(old_node) 

1007 children[i] = new_node 

1008 fields = getattr(parent, '_fields', None) 

1009 if fields: 

1010 for field in fields: 

1011 # Find the parent field that refers to old_node. 

1012 if getattr(parent, field, None) == old_node: 

1013 setattr(parent, field, new_node) 

1014 break 

1015 #@+node:ekr.20191225055626.1: *4* function: replace_token 

1016 def replace_token(token: "Token", kind: str, value: str) -> None: 

1017 """Replace kind and value of the given token.""" 

1018 if token.kind in ('endmarker', 'killed'): 

1019 return 

1020 token.kind = kind 

1021 token.value = value 

1022 token.node = None # Should be filled later. 

1023 #@-others 

1024#@+node:ekr.20191027072910.1: ** Exception classes 

1025class AssignLinksError(Exception): 

1026 """Assigning links to ast nodes failed.""" 

1027 

1028 

1029class AstNotEqual(Exception): 

1030 """The two given AST's are not equivalent.""" 

1031 

1032class BeautifyError(Exception): 

1033 """Leading tabs found.""" 

1034 

1035 

1036class FailFast(Exception): 

1037 """Abort tests in TestRunner class.""" 

1038#@+node:ekr.20220402062255.1: ** Classes 

1039#@+node:ekr.20141012064706.18390: *3* class AstDumper 

1040class AstDumper: # pragma: no cover 

1041 """A class supporting various kinds of dumps of ast nodes.""" 

1042 #@+others 

1043 #@+node:ekr.20191112033445.1: *4* dumper.dump_tree & helper 

1044 def dump_tree(self, tokens: List["Token"], tree: Node) -> str: 

1045 """Briefly show a tree, properly indented.""" 

1046 self.tokens = tokens 

1047 result = [self.show_header()] 

1048 self.dump_tree_and_links_helper(tree, 0, result) 

1049 return ''.join(result) 

1050 #@+node:ekr.20191125035321.1: *5* dumper.dump_tree_and_links_helper 

1051 def dump_tree_and_links_helper(self, node: Node, level: int, result: List[str]) -> None: 

1052 """Return the list of lines in result.""" 

1053 if node is None: 

1054 return 

1055 # Let block. 

1056 indent = ' ' * 2 * level 

1057 children: List[ast.AST] = getattr(node, 'children', []) 

1058 node_s = self.compute_node_string(node, level) 

1059 # Dump... 

1060 if isinstance(node, (list, tuple)): 

1061 for z in node: 

1062 self.dump_tree_and_links_helper(z, level, result) 

1063 elif isinstance(node, str): 

1064 result.append(f"{indent}{node.__class__.__name__:>8}:{node}\n") 

1065 elif isinstance(node, ast.AST): 

1066 # Node and parent. 

1067 result.append(node_s) 

1068 # Children. 

1069 for z in children: 

1070 self.dump_tree_and_links_helper(z, level + 1, result) 

1071 else: 

1072 result.append(node_s) 

1073 #@+node:ekr.20191125035600.1: *4* dumper.compute_node_string & helpers 

1074 def compute_node_string(self, node: Node, level: int) -> str: 

1075 """Return a string summarizing the node.""" 

1076 indent = ' ' * 2 * level 

1077 parent = getattr(node, 'parent', None) 

1078 node_id = getattr(node, 'node_index', '??') 

1079 parent_id = getattr(parent, 'node_index', '??') 

1080 parent_s = f"{parent_id:>3}.{parent.__class__.__name__} " if parent else '' 

1081 class_name = node.__class__.__name__ 

1082 descriptor_s = f"{node_id}.{class_name}: " + self.show_fields( 

1083 class_name, node, 30) 

1084 tokens_s = self.show_tokens(node, 70, 100) 

1085 lines = self.show_line_range(node) 

1086 full_s1 = f"{parent_s:<16} {lines:<10} {indent}{descriptor_s} " 

1087 node_s = f"{full_s1:<62} {tokens_s}\n" 

1088 return node_s 

1089 #@+node:ekr.20191113223424.1: *5* dumper.show_fields 

1090 def show_fields(self, class_name: str, node: Node, truncate_n: int) -> str: 

1091 """Return a string showing interesting fields of the node.""" 

1092 val = '' 

1093 if class_name == 'JoinedStr': 

1094 values = node.values 

1095 assert isinstance(values, list) 

1096 # Str tokens may represent *concatenated* strings. 

1097 results = [] 

1098 fstrings, strings = 0, 0 

1099 for z in values: 

1100 assert isinstance(z, (ast.FormattedValue, ast.Str)) 

1101 if isinstance(z, ast.Str): 

1102 results.append(z.s) 

1103 strings += 1 

1104 else: 

1105 results.append(z.__class__.__name__) 

1106 fstrings += 1 

1107 val = f"{strings} str, {fstrings} f-str" 

1108 elif class_name == 'keyword': 

1109 if isinstance(node.value, ast.Str): 

1110 val = f"arg={node.arg}..Str.value.s={node.value.s}" 

1111 elif isinstance(node.value, ast.Name): 

1112 val = f"arg={node.arg}..Name.value.id={node.value.id}" 

1113 else: 

1114 val = f"arg={node.arg}..value={node.value.__class__.__name__}" 

1115 elif class_name == 'Name': 

1116 val = f"id={node.id!r}" 

1117 elif class_name == 'NameConstant': 

1118 val = f"value={node.value!r}" 

1119 elif class_name == 'Num': 

1120 val = f"n={node.n}" 

1121 elif class_name == 'Starred': 

1122 if isinstance(node.value, ast.Str): 

1123 val = f"s={node.value.s}" 

1124 elif isinstance(node.value, ast.Name): 

1125 val = f"id={node.value.id}" 

1126 else: 

1127 val = f"s={node.value.__class__.__name__}" 

1128 elif class_name == 'Str': 

1129 val = f"s={node.s!r}" 

1130 elif class_name in ('AugAssign', 'BinOp', 'BoolOp', 'UnaryOp'): # IfExp 

1131 name = node.op.__class__.__name__ 

1132 val = f"op={_op_names.get(name, name)}" 

1133 elif class_name == 'Compare': 

1134 ops = ','.join([op_name(z) for z in node.ops]) 

1135 val = f"ops='{ops}'" 

1136 else: 

1137 val = '' 

1138 return g.truncate(val, truncate_n) 

1139 #@+node:ekr.20191114054726.1: *5* dumper.show_line_range 

1140 def show_line_range(self, node: Node) -> str: 

1141 

1142 token_list = get_node_token_list(node, self.tokens) 

1143 if not token_list: 

1144 return '' 

1145 min_ = min([z.line_number for z in token_list]) 

1146 max_ = max([z.line_number for z in token_list]) 

1147 return f"{min_}" if min_ == max_ else f"{min_}..{max_}" 

1148 #@+node:ekr.20191113223425.1: *5* dumper.show_tokens 

1149 def show_tokens(self, node: Node, n: int, m: int, show_cruft: bool=False) -> str: 

1150 """ 

1151 Return a string showing node.token_list. 

1152 

1153 Split the result if n + len(result) > m 

1154 """ 

1155 token_list = get_node_token_list(node, self.tokens) 

1156 result = [] 

1157 for z in token_list: 

1158 val = None 

1159 if z.kind == 'comment': 

1160 if show_cruft: 

1161 val = g.truncate(z.value, 10) # Short is good. 

1162 result.append(f"{z.kind}.{z.index}({val})") 

1163 elif z.kind == 'name': 

1164 val = g.truncate(z.value, 20) 

1165 result.append(f"{z.kind}.{z.index}({val})") 

1166 elif z.kind == 'newline': 

1167 # result.append(f"{z.kind}.{z.index}({z.line_number}:{len(z.line)})") 

1168 result.append(f"{z.kind}.{z.index}") 

1169 elif z.kind == 'number': 

1170 result.append(f"{z.kind}.{z.index}({z.value})") 

1171 elif z.kind == 'op': 

1172 if z.value not in ',()' or show_cruft: 

1173 result.append(f"{z.kind}.{z.index}({z.value})") 

1174 elif z.kind == 'string': 

1175 val = g.truncate(z.value, 30) 

1176 result.append(f"{z.kind}.{z.index}({val})") 

1177 elif z.kind == 'ws': 

1178 if show_cruft: 

1179 result.append(f"{z.kind}.{z.index}({len(z.value)})") 

1180 else: 

1181 # Indent, dedent, encoding, etc. 

1182 # Don't put a blank. 

1183 continue 

1184 if result and result[-1] != ' ': 

1185 result.append(' ') 

1186 # 

1187 # split the line if it is too long. 

1188 # g.printObj(result, tag='show_tokens') 

1189 if 1: 

1190 return ''.join(result) 

1191 line, lines = [], [] 

1192 for r in result: 

1193 line.append(r) 

1194 if n + len(''.join(line)) >= m: 

1195 lines.append(''.join(line)) 

1196 line = [] 

1197 lines.append(''.join(line)) 

1198 pad = '\n' + ' ' * n 

1199 return pad.join(lines) 

1200 #@+node:ekr.20191110165235.5: *4* dumper.show_header 

1201 def show_header(self) -> str: 

1202 """Return a header string, but only the fist time.""" 

1203 return ( 

1204 f"{'parent':<16} {'lines':<10} {'node':<34} {'tokens'}\n" 

1205 f"{'======':<16} {'=====':<10} {'====':<34} {'======'}\n") 

1206 #@+node:ekr.20141012064706.18392: *4* dumper.dump_ast & helper 

1207 annotate_fields = False 

1208 include_attributes = False 

1209 indent_ws = ' ' 

1210 

1211 def dump_ast(self, node: Node, level: int=0) -> str: 

1212 """ 

1213 Dump an ast tree. Adapted from ast.dump. 

1214 """ 

1215 sep1 = '\n%s' % (self.indent_ws * (level + 1)) 

1216 if isinstance(node, ast.AST): 

1217 fields = [(a, self.dump_ast(b, level + 1)) for a, b in self.get_fields(node)] 

1218 if self.include_attributes and node._attributes: 

1219 fields.extend([(a, self.dump_ast(getattr(node, a), level + 1)) 

1220 for a in node._attributes]) 

1221 if self.annotate_fields: 

1222 aList = ['%s=%s' % (a, b) for a, b in fields] 

1223 else: 

1224 aList = [b for a, b in fields] 

1225 name = node.__class__.__name__ 

1226 sep = '' if len(aList) <= 1 else sep1 

1227 return '%s(%s%s)' % (name, sep, sep1.join(aList)) 

1228 if isinstance(node, list): 

1229 sep = sep1 

1230 return 'LIST[%s]' % ''.join( 

1231 ['%s%s' % (sep, self.dump_ast(z, level + 1)) for z in node]) 

1232 return repr(node) 

1233 #@+node:ekr.20141012064706.18393: *5* dumper.get_fields 

1234 def get_fields(self, node: Node) -> Generator: 

1235 

1236 return ( 

1237 (a, b) for a, b in ast.iter_fields(node) 

1238 if a not in ['ctx',] and b not in (None, []) 

1239 ) 

1240 #@-others 

1241#@+node:ekr.20191222083453.1: *3* class Fstringify 

1242class Fstringify: 

1243 """A class to fstringify files.""" 

1244 

1245 silent = True # for pytest. Defined in all entries. 

1246 line_number = 0 

1247 line = '' 

1248 

1249 #@+others 

1250 #@+node:ekr.20191222083947.1: *4* fs.fstringify 

1251 def fstringify(self, contents: str, filename: str, tokens: List["Token"], tree: Node) -> str: 

1252 """ 

1253 Fstringify.fstringify: 

1254 

1255 f-stringify the sources given by (tokens, tree). 

1256 

1257 Return the resulting string. 

1258 """ 

1259 self.filename = filename 

1260 self.tokens = tokens 

1261 self.tree = tree 

1262 # Prepass: reassign tokens. 

1263 ReassignTokens().reassign(filename, tokens, tree) 

1264 # Main pass. 

1265 for node in ast.walk(tree): 

1266 if ( 

1267 isinstance(node, ast.BinOp) 

1268 and op_name(node.op) == '%' 

1269 and isinstance(node.left, ast.Str) 

1270 ): 

1271 self.make_fstring(node) 

1272 results = tokens_to_string(self.tokens) 

1273 return results 
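# A sketch of the overall effect (hedged; the exact result depends on the
# %-specs that scan_format_string finds):
#   before: s = "%s = %r" % (name, value)
#   after:  s = f"{name} = {value!r}"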

1274 #@+node:ekr.20200103054101.1: *4* fs.fstringify_file (entry) 

1275 def fstringify_file(self, filename: str) -> bool: # pragma: no cover 

1276 """ 

1277 Fstringify.fstringify_file. 

1278 

1279 The entry point for the fstringify-file command. 

1280 

1281 f-stringify the given external file with the Fstringify class. 

1282 

1283 Return True if the file was changed. 

1284 """ 

1285 tag = 'fstringify-file' 

1286 self.filename = filename 

1287 self.silent = False 

1288 tog = TokenOrderGenerator() 

1289 try: 

1290 contents, encoding, tokens, tree = tog.init_from_file(filename) 

1291 if not contents or not tokens or not tree: 

1292 print(f"{tag}: Can not fstringify: {filename}") 

1293 return False 

1294 results = self.fstringify(contents, filename, tokens, tree) 

1295 except Exception as e: 

1296 print(e) 

1297 return False 

1298 # Something besides newlines must change. 

1299 changed = regularize_nls(contents) != regularize_nls(results) 

1300 status = 'Wrote' if changed else 'Unchanged' 

1301 print(f"{tag}: {status:>9}: {filename}") 

1302 if changed: 

1303 write_file(filename, results, encoding=encoding) 

1304 return changed 

1305 #@+node:ekr.20200103065728.1: *4* fs.fstringify_file_diff (entry) 

1306 def fstringify_file_diff(self, filename: str) -> bool: # pragma: no cover 

1307 """ 

1308 Fstringify.fstringify_file_diff. 

1309 

1310 The entry point for the diff-fstringify-file command. 

1311 

1312 Print the diffs that would result from the fstringify-file command. 

1313 

1314 Return True if the file would be changed. 

1315 """ 

1316 tag = 'diff-fstringify-file' 

1317 self.filename = filename 

1318 self.silent = False 

1319 tog = TokenOrderGenerator() 

1320 try: 

1321 contents, encoding, tokens, tree = tog.init_from_file(filename) 

1322 if not contents or not tokens or not tree: 

1323 return False 

1324 results = self.fstringify(contents, filename, tokens, tree) 

1325 except Exception as e: 

1326 print(e) 

1327 return False 

1328 # Something besides newlines must change. 

1329 changed = regularize_nls(contents) != regularize_nls(results) 

1330 if changed: 

1331 show_diffs(contents, results, filename=filename) 

1332 else: 

1333 print(f"{tag}: Unchanged: {filename}") 

1334 return changed 

1335 #@+node:ekr.20200112060218.1: *4* fs.fstringify_file_silent (entry) 

1336 def fstringify_file_silent(self, filename: str) -> bool: # pragma: no cover 

1337 """ 

1338 Fstringify.fstringify_file_silent. 

1339 

1340 The entry point for the silent-fstringify-file command. 

1341 

1342 fstringify the given file, suppressing all but serious error messages. 

1343 

1344 Return True if the file would be changed. 

1345 """ 

1346 self.filename = filename 

1347 self.silent = True 

1348 tog = TokenOrderGenerator() 

1349 try: 

1350 contents, encoding, tokens, tree = tog.init_from_file(filename) 

1351 if not contents or not tokens or not tree: 

1352 return False 

1353 results = self.fstringify(contents, filename, tokens, tree) 

1354 except Exception as e: 

1355 print(e) 

1356 return False 

1357 # Something besides newlines must change. 

1358 changed = regularize_nls(contents) != regularize_nls(results) 

1359 status = 'Wrote' if changed else 'Unchanged' 

1360 # Write the results. 

1361 print(f"{status:>9}: {filename}") 

1362 if changed: 

1363 write_file(filename, results, encoding=encoding) 

1364 return changed 

1365 #@+node:ekr.20191222095754.1: *4* fs.make_fstring & helpers 

1366 def make_fstring(self, node: Node) -> None: 

1367 """ 

1368 node is BinOp node representing an '%' operator. 

1369 node.left is an ast.Str node. 

1370 node.right represents the RHS of the '%' operator. 

1371 

1372 Convert this tree to an f-string, if possible. 

1373 Replace the node's entire tree with a new ast.Str node. 

1374 Replace all the relevant tokens with a single new 'string' token. 

1375 """ 

1376 trace = False 

1377 assert isinstance(node.left, ast.Str), (repr(node.left), g.callers()) 

1378 # Careful: use the tokens, not Str.s. This preserves spelling. 

1379 lt_token_list = get_node_token_list(node.left, self.tokens) 

1380 if not lt_token_list: # pragma: no cover 

1381 print('') 

1382 g.trace('Error: no token list in Str') 

1383 dump_tree(self.tokens, node) 

1384 print('') 

1385 return 

1386 lt_s = tokens_to_string(lt_token_list) 

1387 if trace: 

1388 g.trace('lt_s:', lt_s) # pragma: no cover 

1389 # Get the RHS values, a list of token lists. 

1390 values = self.scan_rhs(node.right) 

1391 if trace: # pragma: no cover 

1392 for i, z in enumerate(values): 

1393 dump_tokens(z, tag=f"RHS value {i}") 

1394 # Compute rt_s, self.line and self.line_number for later messages. 

1395 token0 = lt_token_list[0] 

1396 self.line_number = token0.line_number 

1397 self.line = token0.line.strip() 

1398 rt_s = ''.join(tokens_to_string(z) for z in values) 

1399 # Get the % specs in the LHS string. 

1400 specs = self.scan_format_string(lt_s) 

1401 if len(values) != len(specs): # pragma: no cover 

1402 self.message( 

1403 f"can't create f-fstring: {lt_s!r}\n" 

1404 f":f-string mismatch: " 

1405 f"{len(values)} value{g.plural(len(values))}, " 

1406 f"{len(specs)} spec{g.plural(len(specs))}") 

1407 return 

1408 # Replace specs with values. 

1409 results = self.substitute_values(lt_s, specs, values) 

1410 result = self.compute_result(lt_s, results) 

1411 if not result: 

1412 return 

1413 # Remove whitespace before ! and :. 

1414 result = self.clean_ws(result) 

1415 # Show the results 

1416 if trace: # pragma: no cover 

1417 before = (lt_s + ' % ' + rt_s).replace('\n', '<NL>') 

1418 after = result.replace('\n', '<NL>') 

1419 self.message( 

1420 f"trace:\n" 

1421 f":from: {before!s}\n" 

1422 f": to: {after!s}") 

1423 # Adjust the tree and the token list. 

1424 self.replace(node, result, values) 

1425 #@+node:ekr.20191222102831.3: *5* fs.clean_ws 

1426 ws_pat = re.compile(r'(\s+)([:!][0-9]\})') 

1427 

1428 def clean_ws(self, s: str) -> str: 

1429 """Carefully remove whitespace before ! and : specifiers.""" 

1430 s = re.sub(self.ws_pat, r'\2', s) 

1431 return s 
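# A hedged sketch: ws_pat only matches whitespace before short specs such
# as ':2}' (one digit after ':' or '!'). Never executed.
if 0:  # pragma: no cover
    assert Fstringify().clean_ws('{x :2}') == '{x:2}'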

1432 #@+node:ekr.20191222102831.4: *5* fs.compute_result & helpers 

1433 def compute_result(self, lt_s: str, tokens: List["Token"]) -> str: 

1434 """ 

1435 Create the final result, with various kinds of munges. 

1436 

1437 Return the result string, or None if there are errors. 

1438 """ 

1439 # Fail if there is a backslash within { and }. 

1440 if not self.check_back_slashes(lt_s, tokens): 

1441 return None # pragma: no cover 

1442 # Ensure consistent quotes. 

1443 if not self.change_quotes(lt_s, tokens): 

1444 return None # pragma: no cover 

1445 return tokens_to_string(tokens) 

1446 #@+node:ekr.20200215074309.1: *6* fs.check_back_slashes 

1447 def check_back_slashes(self, lt_s: str, tokens: List["Token"]) -> bool: 

1448 """ 

1449 Return False if any backslash appears within an {} expression. 

1450 

1451 tokens is a list of tokens on the RHS. 

1452 """ 

1453 count = 0 

1454 for z in tokens: 

1455 if z.kind == 'op': 

1456 if z.value == '{': 

1457 count += 1 

1458 elif z.value == '}': 

1459 count -= 1 

1460 if (count % 2) == 1 and '\\' in z.value: 

1461 if not self.silent: 

1462 self.message( # pragma: no cover (silent during unit tests) 

1463 f"can't create f-fstring: {lt_s!r}\n" 

1464 f":backslash in {{expr}}:") 

1465 return False 

1466 return True 

1467 #@+node:ekr.20191222102831.7: *6* fs.change_quotes 

1468 def change_quotes(self, lt_s: str, aList: List[Any]) -> bool: 

1469 """ 

1470 Carefully check quotes in all "inner" tokens as necessary. 

1471 

1472 Return False if the f-string would contain backslashes. 

1473 

1474 We expect the following "outer" tokens. 

1475 

1476 aList[0]: ('string', 'f') 

1477 aList[1]: ('string', a single or double quote) 

1478 aList[-1]: ('string', a single or double quote matching aList[1]) 

1479 """ 

1480 # Sanity checks. 

1481 if len(aList) < 4: 

1482 return True # pragma: no cover (defensive) 

1483 if not lt_s: # pragma: no cover (defensive) 

1484 self.message("can't create f-fstring: no lt_s!") 

1485 return False 

1486 delim = lt_s[0] 

1487 # Check tokens 0, 1 and -1. 

1488 token0 = aList[0] 

1489 token1 = aList[1] 

1490 token_last = aList[-1] 

1491 for token in token0, token1, token_last: 

1492 # These are the only kinds of tokens we expect to generate. 

1493 ok = ( 

1494 token.kind == 'string' or 

1495 token.kind == 'op' and token.value in '{}') 

1496 if not ok: # pragma: no cover (defensive) 

1497 self.message( 

1498 f"unexpected token: {token.kind} {token.value}\n" 

1499 f": lt_s: {lt_s!r}") 

1500 return False 

1501 # These checks are important... 

1502 if token0.value != 'f': 

1503 return False # pragma: no cover (defensive) 

1504 val1 = token1.value 

1505 if delim != val1: 

1506 return False # pragma: no cover (defensive) 

1507 val_last = token_last.value 

1508 if delim != val_last: 

1509 return False # pragma: no cover (defensive) 

1510 # 

1511 # Check for conflicting delims, preferring f"..." to f'...'. 

1512 for delim in ('"', "'"): 

1513 aList[1] = aList[-1] = Token('string', delim) 

1514 for z in aList[2:-1]: 

1515 if delim in z.value: 

1516 break 

1517 else: 

1518 return True 

1519 if not self.silent: # pragma: no cover (silent unit test) 

1520 self.message( 

1521 f"can't create f-fstring: {lt_s!r}\n" 

1522 f": conflicting delims:") 

1523 return False 

1524 #@+node:ekr.20191222102831.6: *5* fs.munge_spec 

1525 def munge_spec(self, spec: str) -> Tuple[str, str]: 

1526 """ 

1527 Return (head, tail). 

1528 

1529 The format of spec is !head:tail or :tail. 

1530 

1531 Example specs: s2, r3 

1532 """ 

1533 # To do: handle more specs. 

1534 head, tail = [], [] 

1535 if spec.startswith('+'): 

1536 pass # Leave it alone! 

1537 elif spec.startswith('-'): 

1538 tail.append('>') 

1539 spec = spec[1:] 

1540 if spec.endswith('s'): 

1541 spec = spec[:-1] 

1542 if spec.endswith('r'): 

1543 head.append('r') 

1544 spec = spec[:-1] 

1545 tail_s = ''.join(tail) + spec 

1546 head_s = ''.join(head) 

1547 return head_s, tail_s 
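
# Editorial examples (hand-traced from the code above, not part of leoAst.py):
#   munge_spec('s')    -> ('', '')      # %s needs no conversion or spec.
#   munge_spec('r')    -> ('r', '')     # %r becomes the !r conversion.
#   munge_spec('-10s') -> ('', '>10')   # '-' becomes the '>' alignment.
#   munge_spec('5.2f') -> ('', '5.2f')  # Numeric specs pass through unchanged.
# The caller then emits {value!head:tail}, omitting empty parts.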

1548 #@+node:ekr.20191222102831.9: *5* fs.scan_format_string 

1549 # format_spec ::= [[fill]align][sign][#][0][width][,][.precision][type] 

1550 # fill ::= <any character> 

1551 # align ::= "<" | ">" | "=" | "^" 

1552 # sign ::= "+" | "-" | " " 

1553 # width ::= integer 

1554 # precision ::= integer 

1555 # type ::= "b" | "c" | "d" | "e" | "E" | "f" | "F" | "g" | "G" | "n" | "o" | "s" | "x" | "X" | "%" 

1556 

1557 format_pat = re.compile(r'%(([+-]?[0-9]*(\.)?[0-9]*)*[bcdeEfFgGnoxrsX]?)') 

1558 

1559 def scan_format_string(self, s: str) -> List[re.Match]: 

1560 """Scan the format string s, returning a list of match objects.""" 

1561 result = list(re.finditer(self.format_pat, s)) 

1562 return result 
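
# Editorial demo (not part of leoAst.py): what format_pat extracts.
# Assumes the corrected character class [0-9] in the pattern above.
import re
pat = re.compile(r'%(([+-]?[0-9]*(\.)?[0-9]*)*[bcdeEfFgGnoxrsX]?)')
specs = [m.group(1) for m in pat.finditer('%s = %-10r or %5.2f')]
assert specs == ['s', '-10r', '5.2f']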

1563 #@+node:ekr.20191222104224.1: *5* fs.scan_rhs 

1564 def scan_rhs(self, node: Node) -> List[Any]: 

1565 """ 

1566 Scan the right-hand side of a potential f-string. 

1567 

1568 Return a list of the token lists for each element. 

1569 """ 

1570 trace = False 

1571 # First, try the most common cases. 

1572 if isinstance(node, ast.Str): 

1573 token_list = get_node_token_list(node, self.tokens) 

1574 return [token_list] 

1575 if isinstance(node, (list, tuple, ast.Tuple)): 

1576 result = [] 

1577 elts = node.elts if isinstance(node, ast.Tuple) else node 

1578 for i, elt in enumerate(elts): 

1579 tokens = tokens_for_node(self.filename, elt, self.tokens) 

1580 result.append(tokens) 

1581 if trace: # pragma: no cover 

1582 g.trace(f"item: {i}: {elt.__class__.__name__}") 

1583 g.printObj(tokens, tag=f"Tokens for item {i}") 

1584 return result 

1585 # Now we expect only one result. 

1586 tokens = tokens_for_node(self.filename, node, self.tokens) 

1587 return [tokens] 

1588 #@+node:ekr.20191226155316.1: *5* fs.substitute_values 

1589 def substitute_values(self, lt_s: str, specs: List[re.Match], values: List) -> List["Token"]: 

1590 """ 

1591 Replace specifiers with values in the lt_s string. 

1592 

1593 Double { and } as needed. 

1594 """ 

1595 i, results = 0, [Token('string', 'f')] 

1596 for spec_i, m in enumerate(specs): 

1597 value = tokens_to_string(values[spec_i]) 

1598 start, end, spec = m.start(0), m.end(0), m.group(1) 

1599 if start > i: 

1600 val = lt_s[i:start].replace('{', '{{').replace('}', '}}') 

1601 results.append(Token('string', val[0])) 

1602 results.append(Token('string', val[1:])) 

1603 head, tail = self.munge_spec(spec) 

1604 results.append(Token('op', '{')) 

1605 results.append(Token('string', value)) 

1606 if head: 

1607 results.append(Token('string', '!')) 

1608 results.append(Token('string', head)) 

1609 if tail: 

1610 results.append(Token('string', ':')) 

1611 results.append(Token('string', tail)) 

1612 results.append(Token('op', '}')) 

1613 i = end 

1614 # Add the tail. 

1615 tail = lt_s[i:] 

1616 if tail: 

1617 tail = tail.replace('{', '{{').replace('}', '}}') 

1618 results.append(Token('string', tail[:-1])) 

1619 results.append(Token('string', tail[-1])) 

1620 return results 
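
# Editorial demo (not part of leoAst.py): why literal braces are doubled.
# An undoubled '{' or '}' in the %-format text would otherwise start an
# expression in the generated f-string.
assert '{a} %s'.replace('{', '{{').replace('}', '}}') == '{{a}} %s'
assert f"{{a}} {'x'}" == '{a} x'  # Doubled braces print literally.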

1621 #@+node:ekr.20200214142019.1: *4* fs.message 

1622 def message(self, message: str) -> None: # pragma: no cover. 

1623 """ 

1624 Print one or more message lines aligned on the first colon of the message. 

1625 """ 

1626 # Print a leading blank line. 

1627 print('') 

1628 # Calculate the padding. 

1629 lines = g.splitLines(message) 

1630 pad = max(lines[0].find(':'), 30) 

1631 # Print the first line. 

1632 z = lines[0] 

1633 i = z.find(':') 

1634 if i == -1: 

1635 print(z.rstrip()) 

1636 else: 

1637 print(f"{z[:i+2].strip():>{pad+1}} {z[i+2:].strip()}") 

1638 # Print the remaining message lines. 

1639 for z in lines[1:]: 

1640 if z.startswith('<'): 

1641 # Print left aligned. 

1642 print(z[1:].strip()) 

1643 elif z.startswith(':') and -1 < z[1:].find(':') <= pad: 

1644 # Align with the first line. 

1645 i = z[1:].find(':') 

1646 print(f"{z[1:i+2].strip():>{pad+1}} {z[i+2:].strip()}") 

1647 elif z.startswith('>'): 

1648 # Align after the aligning colon. 

1649 print(f"{' ':>{pad+2}}{z[1:].strip()}") 

1650 else: 

1651 # Default: Put the entire line after the aligning colon. 

1652 print(f"{' ':>{pad+2}}{z.strip()}") 

1653 # Print the standard message lines. 

1654 file_s = f"{'file':>{pad}}" 

1655 ln_n_s = f"{'line number':>{pad}}" 

1656 line_s = f"{'line':>{pad}}" 

1657 print( 

1658 f"{file_s}: {self.filename}\n" 

1659 f"{ln_n_s}: {self.line_number}\n" 

1660 f"{line_s}: {self.line!r}") 

1661 #@+node:ekr.20191225054848.1: *4* fs.replace 

1662 def replace(self, node: Node, s: str, values: List["Token"]) -> None: 

1663 """ 

1664 Replace node with an ast.Str node for s. 

1665 Replace all tokens in the range of values with a single 'string' node. 

1666 """ 

1667 # Replace the tokens... 

1668 tokens = tokens_for_node(self.filename, node, self.tokens) 

1669 i1 = i = tokens[0].index 

1670 replace_token(self.tokens[i], 'string', s) 

1671 j = 1 

1672 while j < len(tokens): 

1673 replace_token(self.tokens[i1 + j], 'killed', '') 

1674 j += 1 

1675 # Replace the node. 

1676 new_node = ast.Str() 

1677 new_node.s = s 

1678 replace_node(new_node, node) 

1679 # Update the token. 

1680 token = self.tokens[i1] 

1681 token.node = new_node 

1682 # Update the token list. 

1683 add_token_to_token_list(token, new_node) 

1684 #@-others 

1685#@+node:ekr.20220330191947.1: *3* class IterativeTokenGenerator 

1686class IterativeTokenGenerator: 

1687 """ 

1688 Self-contained iterative token syncing class. It shows how to traverse 

1689 any tree with neither recursion nor iterators. 

1690 

1691 This class is almost exactly as fast as the TokenOrderGenerator class. 

1692 

1693 This class is another curio: Leo does not use this code. 

1694 

1695 The main_loop method executes **actions**: (method, argument) tuples. 

1696 

1697 The key idea: visitors (and visit) never execute code directly. 

1698 Instead, they queue methods to be executed in the main loop. 

1699 

1700 *Important*: find_next_significant_token must be called only *after* 

1701 actions have eaten all previous tokens. So do_If (and other visitors) 

1702 must queue up **helper actions** for later (delayed) execution. 

1703 """ 

1704 

1705 begin_end_stack: List[str] = [] # A stack of node names. 

1706 n_nodes = 0 # The number of nodes that have been visited. 

1707 node = None # The current node. 

1708 node_index = 0 # A serial number, injected into each visited node. 

1709 node_stack: List[ast.AST] = [] # The stack of parent nodes. 

1710 

1711 #@+others 

1712 #@+node:ekr.20220402095550.1: *4* iterative: Init... 

1713 # Same as in the TokenOrderGenerator class. 

1714 #@+node:ekr.20220402095550.2: *5* iterative.balance_tokens 

1715 def balance_tokens(self, tokens: List["Token"]) -> int: 

1716 """ 

1717 TOG.balance_tokens. 

1718 

1719 Insert two-way links between matching paren tokens. 

1720 """ 

1721 count, stack = 0, [] 

1722 for token in tokens: 

1723 if token.kind == 'op': 

1724 if token.value == '(': 

1725 count += 1 

1726 stack.append(token.index) 

1727 if token.value == ')': 

1728 if stack: 

1729 index = stack.pop() 

1730 tokens[index].matching_paren = token.index 

1731 tokens[token.index].matching_paren = index 

1732 else: # pragma: no cover 

1733 g.trace(f"unmatched ')' at index {token.index}") 

1734 if stack: # pragma: no cover 

1735 g.trace(f"unmatched '(' at {','.join(map(str, stack))}") 

1736 return count 
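
# Editorial sketch (not part of leoAst.py): the same stack technique,
# linking matching parens by index in a plain string.
def match_parens(s):
    pairs, stack = {}, []
    for i, ch in enumerate(s):
        if ch == '(':
            stack.append(i)
        elif ch == ')' and stack:
            j = stack.pop()
            pairs[i], pairs[j] = j, i
    return pairs

assert match_parens('f(a(b))') == {1: 6, 6: 1, 3: 5, 5: 3}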

1737 #@+node:ekr.20220402095550.3: *5* iterative.create_links (changed) 

1738 def create_links(self, tokens: List["Token"], tree: Node, file_name: str='') -> List: 

1739 """ 

1740 Create two-way links between the given tokens and the ast tree. 

1741 

1742 For compatibility with legacy code, callers may write list(tog.create_links(...)). 

1743 

1744 The sync_token method creates the links and verifies that the resulting 

1745 tree traversal generates exactly the given tokens in exact order. 

1746 

1747 tokens: the list of Token instances for the input. 

1748 Created by make_tokens(). 

1749 tree: the ast tree for the input. 

1750 Created by parse_ast(). 

1751 """ 

1752 # Init all ivars. 

1753 self.file_name = file_name # For tests. 

1754 self.node = None # The node being visited. 

1755 self.tokens = tokens # The immutable list of input tokens. 

1756 self.tree = tree # The tree of ast.AST nodes. 

1757 # Traverse the tree. 

1758 self.main_loop(tree) 

1759 # Ensure that all tokens are patched. 

1760 self.node = tree 

1761 self.token(('endmarker', '')) 

1762 # Return [] for compatibility with legacy code: list(tog.create_links). 

1763 return [] 

1764 #@+node:ekr.20220402095550.4: *5* iterative.init_from_file 

1765 def init_from_file(self, filename: str) -> Tuple[str, str, List["Token"], Node]: # pragma: no cover 

1766 """ 

1767 Create the tokens and ast tree for the given file. 

1768 Create links between tokens and the parse tree. 

1769 Return (contents, encoding, tokens, tree). 

1770 """ 

1771 self.filename = filename 

1772 encoding, contents = read_file_with_encoding(filename) 

1773 if not contents: 

1774 return None, None, None, None 

1775 self.tokens = tokens = make_tokens(contents) 

1776 self.tree = tree = parse_ast(contents) 

1777 self.create_links(tokens, tree) 

1778 return contents, encoding, tokens, tree 

1779 #@+node:ekr.20220402095550.5: *5* iterative.init_from_string 

1780 def init_from_string(self, contents: str, filename: str) -> Tuple[List["Token"], Node]: # pragma: no cover 

1781 """ 

1782 Tokenize, parse and create links in the contents string. 

1783 

1784 Return (tokens, tree). 

1785 """ 

1786 self.filename = filename 

1787 self.tokens = tokens = make_tokens(contents) 

1788 self.tree = tree = parse_ast(contents) 

1789 self.create_links(tokens, tree) 

1790 return tokens, tree 

1791 #@+node:ekr.20220402094825.1: *4* iterative: Synchronizers... 

1792 # These synchronizer methods sync various kinds of tokens to nodes. 

1793 # 

1794 # These methods are (mostly) the same as in the TokenOrderGenerator class. 

1795 # 

1796 # Important: The sync_token in this class has a different signature from its TOG counterpart. 

1797 # This slight difference makes it difficult to reuse the TOG methods, 

1798 # say via monkey-patching. 

1799 # 

1800 # So I just copied/pasted these methods. This strategy suffices 

1801 # to illustrate the ideas presented in this class. 

1802 

1803 #@+node:ekr.20220402094825.2: *5* iterative.find_next_significant_token 

1804 def find_next_significant_token(self) -> Optional["Token"]: 

1805 """ 

1806 Scan from *after* self.tokens[px] looking for the next significant 

1807 token. 

1808 

1809 Return the token, or None. Never change self.px. 

1810 """ 

1811 px = self.px + 1 

1812 while px < len(self.tokens): 

1813 token = self.tokens[px] 

1814 px += 1 

1815 if is_significant_token(token): 

1816 return token 

1817 # This will never happen, because the endmarker token is significant. 

1818 return None # pragma: no cover 

1819 #@+node:ekr.20220402094825.3: *5* iterative.set_links 

1820 last_statement_node: Optional[Node] = None 

1821 

1822 def set_links(self, node: Node, token: "Token") -> None: 

1823 """Make two-way links between token and the given node.""" 

1824 # Don't bother assigning comment, comma, paren, ws and endmarker tokens. 

1825 if token.kind == 'comment': 

1826 # Append the comment to node.comment_list. 

1827 comment_list: List["Token"] = getattr(node, 'comment_list', []) 

1828 node.comment_list = comment_list + [token] 

1829 return 

1830 if token.kind in ('endmarker', 'ws'): 

1831 return 

1832 if token.kind == 'op' and token.value in ',()': 

1833 return 

1834 # *Always* remember the last statement. 

1835 statement = find_statement_node(node) 

1836 if statement: 

1837 self.last_statement_node = statement 

1838 assert not isinstance(self.last_statement_node, ast.Module) 

1839 if token.node is not None: # pragma: no cover 

1840 line_s = f"line {token.line_number}:" 

1841 raise AssignLinksError( 

1842 f" file: {self.filename}\n" 

1843 f"{line_s:>12} {token.line.strip()}\n" 

1844 f"token index: {self.px}\n" 

1845 f"token.node is not None\n" 

1846 f" token.node: {token.node.__class__.__name__}\n" 

1847 f" callers: {g.callers()}") 

1848 # Assign newlines to the previous statement node, if any. 

1849 if token.kind in ('newline', 'nl'): 

1850 # Set an *auxiliary* link for the split/join logic. 

1851 # Do *not* set token.node! 

1852 token.statement_node = self.last_statement_node 

1853 return 

1854 if is_significant_token(token): 

1855 # Link the token to the ast node. 

1856 token.node = node 

1857 # Add the token to node's token_list. 

1858 add_token_to_token_list(token, node) 

1859 #@+node:ekr.20220402094825.4: *5* iterative.sync_name (aka name) 

1860 def sync_name(self, val: str) -> None: 

1861 aList = val.split('.') 

1862 if len(aList) == 1: 

1863 self.sync_token(('name', val)) 

1864 else: 

1865 for i, part in enumerate(aList): 

1866 self.sync_token(('name', part)) 

1867 if i < len(aList) - 1: 

1868 self.sync_op('.') 

1869 

1870 name = sync_name # for readability. 

1871 #@+node:ekr.20220402094825.5: *5* iterative.sync_op (aka op) 

1872 def sync_op(self, val: str) -> None: 

1873 """ 

1874 Sync to the given operator. 

1875 

1876 val may be '(' or ')' *only* if the parens *will* actually exist in the 

1877 token list. 

1878 """ 

1879 self.sync_token(('op', val)) 

1880 

1881 op = sync_op # For readability. 

1882 #@+node:ekr.20220402094825.6: *5* iterative.sync_token (aka token) 

1883 px = -1 # Index of the previously synced token. 

1884 

1885 def sync_token(self, data: Tuple[Any, Any]) -> None: 

1886 """ 

1887 Sync to a token whose kind & value are given. The token need not be 

1888 significant, but it must be guaranteed to exist in the token list. 

1889 

1890 The checks in this method constitute a strong, ever-present unit test. 

1891 

1892 Scan the tokens *after* px, looking for a token T matching (kind, val). 

1893 raise AssignLinksError if a significant token is found that doesn't match T. 

1894 Otherwise: 

1895 - Create two-way links between all assignable tokens between px and T. 

1896 - Create two-way links between T and self.node. 

1897 - Advance by updating self.px to point to T. 

1898 """ 

1899 kind, val = data 

1900 node, tokens = self.node, self.tokens 

1901 assert isinstance(node, ast.AST), repr(node) 

1902 # g.trace( 

1903 # f"px: {self.px:2} " 

1904 # f"node: {node.__class__.__name__:<10} " 

1905 # f"kind: {kind:>10}: val: {val!r}") 

1906 # 

1907 # Step one: Look for token T. 

1908 old_px = px = self.px + 1 

1909 while px < len(self.tokens): 

1910 token = tokens[px] 

1911 if (kind, val) == (token.kind, token.value): 

1912 break # Success. 

1913 if kind == token.kind == 'number': 

1914 val = token.value 

1915 break # Benign: use the token's value, a string, instead of a number. 

1916 if is_significant_token(token): # pragma: no cover 

1917 line_s = f"line {token.line_number}:" 

1918 val = str(val) # for g.truncate. 

1919 raise AssignLinksError( 

1920 f" file: {self.filename}\n" 

1921 f"{line_s:>12} {token.line.strip()}\n" 

1922 f"Looking for: {kind}.{g.truncate(val, 40)!r}\n" 

1923 f" found: {token.kind}.{token.value!r}\n" 

1924 f"token.index: {token.index}\n") 

1925 # Skip the insignificant token. 

1926 px += 1 

1927 else: # pragma: no cover 

1928 val = str(val) # for g.truncate. 

1929 raise AssignLinksError( 

1930 f" file: {self.filename}\n" 

1931 f"Looking for: {kind}.{g.truncate(val, 40)}\n" 

1932 f" found: end of token list") 

1933 # 

1934 # Step two: Assign *secondary* links only for newline tokens. 

1935 # Ignore all other non-significant tokens. 

1936 while old_px < px: 

1937 token = tokens[old_px] 

1938 old_px += 1 

1939 if token.kind in ('comment', 'newline', 'nl'): 

1940 self.set_links(node, token) 

1941 # 

1942 # Step three: Set links in the found token. 

1943 token = tokens[px] 

1944 self.set_links(node, token) 

1945 # 

1946 # Step four: Advance. 

1947 self.px = px 

1948 

1949 token = sync_token # For readability. 
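
# Editorial sketch (not part of leoAst.py): step one of sync_token over
# plain (kind, value) pairs. 'significant' is simplified here; leoAst's
# is_significant_token also special-cases 'async', 'await' and some ops.
def scan_for(tokens, px, kind, val):
    px += 1
    while px < len(tokens):
        k, v = tokens[px]
        if (k, v) == (kind, val):
            return px  # Success: this becomes the new self.px.
        if k in ('name', 'number', 'string', 'endmarker'):
            raise ValueError(f"looking for {kind}.{val!r}, found {k}.{v!r}")
        px += 1  # Skip the insignificant token.
    raise ValueError('end of token list')

assert scan_for([('ws', '    '), ('name', 'x')], -1, 'name', 'x') == 1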

1950 #@+node:ekr.20220330164313.1: *4* iterative: Traversal... 

1951 #@+node:ekr.20220402094946.2: *5* iterative.enter_node 

1952 def enter_node(self, node: Node) -> None: 

1953 """Enter a node.""" 

1954 # Update the stats. 

1955 self.n_nodes += 1 

1956 # Create parent/child links first, *before* updating self.node. 

1957 # 

1958 # Don't even *think* about removing the parent/child links. 

1959 # The nearest_common_ancestor function depends upon them. 

1960 node.parent = self.node 

1961 if self.node: 

1962 children: List[Node] = getattr(self.node, 'children', []) 

1963 children.append(node) 

1964 self.node.children = children 

1965 # Inject the node_index field. 

1966 assert not hasattr(node, 'node_index'), g.callers() 

1967 node.node_index = self.node_index 

1968 self.node_index += 1 

1969 # begin_visitor and end_visitor must be paired. 

1970 self.begin_end_stack.append(node.__class__.__name__) 

1971 # Push the previous node. 

1972 self.node_stack.append(self.node) 

1973 # Update self.node *last*. 

1974 self.node = node 

1975 #@+node:ekr.20220402094946.3: *5* iterative.leave_node 

1976 def leave_node(self, node: Node) -> None: 

1977 """Leave a visitor.""" 

1978 # Make *sure* that begin_visitor and end_visitor are paired. 

1979 entry_name = self.begin_end_stack.pop() 

1980 assert entry_name == node.__class__.__name__, f"{entry_name!r} {node.__class__.__name__}" 

1981 assert self.node == node, (repr(self.node), repr(node)) 

1982 # Restore self.node. 

1983 self.node = self.node_stack.pop() 

1984 #@+node:ekr.20220330120220.1: *5* iterative.main_loop 

1985 def main_loop(self, node: Node) -> None: 

1986 

1987 func = getattr(self, 'do_' + node.__class__.__name__, None) 

1988 if not func: # pragma: no cover (defensive code) 

1989 print('main_loop: invalid ast node:', repr(node)) 

1990 return 

1991 exec_list: ActionList = [(func, node)] 

1992 while exec_list: 

1993 func, arg = exec_list.pop(0) 

1994 result = func(arg) 

1995 if result: 

1996 # Prepend the result, a list of tuples. 

1997 assert isinstance(result, list), repr(result) 

1998 exec_list[:0] = result 

1999 

2000 # For debugging... 

2001 # try: 

2002 # func, arg = data 

2003 # if 0: 

2004 # func_name = g.truncate(func.__name__, 15) 

2005 # print( 

2006 # f"{self.node.__class__.__name__:>10}:" 

2007 # f"{func_name:>20} " 

2008 # f"{arg.__class__.__name__}") 

2009 # except ValueError: 

2010 # g.trace('BAD DATA', self.node.__class__.__name__) 

2011 # if isinstance(data, (list, tuple)): 

2012 # for z in data: 

2013 # print(data) 

2014 # else: 

2015 # print(repr(data)) 

2016 # raise 

2017 #@+node:ekr.20220330155314.1: *5* iterative.visit 

2018 def visit(self, node: Node) -> ActionList: 

2019 """'Visit' an ast node by return a new list of tuples.""" 

2020 # Keep this trace. 

2021 if False: # pragma: no cover 

2022 cn = node.__class__.__name__ if node else ' ' 

2023 caller1, caller2 = g.callers(2).split(',') 

2024 g.trace(f"{caller1:>15} {caller2:<14} {cn}") 

2025 if node is None: 

2026 return [] 

2027 # More general, more convenient. 

2028 if isinstance(node, (list, tuple)): 

2029 result = [] 

2030 for z in node: 

2031 if isinstance(z, ast.AST): 

2032 result.append((self.visit, z)) 

2033 else: # pragma: no cover (This might never happen). 

2034 # All other fields should contain ints or strings. 

2035 assert isinstance(z, (int, str)), z.__class__.__name__ 

2036 return result 

2037 # We *do* want to crash if the visitor doesn't exist. 

2038 assert isinstance(node, ast.AST), repr(node) 

2039 method = getattr(self, 'do_' + node.__class__.__name__) 

2040 # Don't call *anything* here. Just return a new list of tuples. 

2041 return [ 

2042 (self.enter_node, node), 

2043 (method, node), 

2044 (self.leave_node, node), 

2045 ] 

2046 #@+node:ekr.20220330133336.1: *4* iterative: Visitors 

2047 #@+node:ekr.20220330133336.2: *5* iterative.keyword: not called! 

2048 # keyword arguments supplied to call (NULL identifier for **kwargs) 

2049 

2050 # keyword = (identifier? arg, expr value) 

2051 

2052 def do_keyword(self, node: Node) -> List: # pragma: no cover 

2053 """A keyword arg in an ast.Call.""" 

2054 # This should never be called. 

2055 # iterative.handle_call_arguments calls self.visit(kwarg_arg.value) instead. 

2056 filename = getattr(self, 'filename', '<no file>') 

2057 raise AssignLinksError( 

2058 f"file: {filename}\n" 

2059 f"do_keyword should never be called\n" 

2060 f"{g.callers(8)}") 

2061 #@+node:ekr.20220330133336.3: *5* iterative: Contexts 

2062 #@+node:ekr.20220330133336.4: *6* iterative.arg 

2063 # arg = (identifier arg, expr? annotation) 

2064 

2065 def do_arg(self, node: Node) -> ActionList: 

2066 """This is one argument of a list of ast.Function or ast.Lambda arguments.""" 

2067 

2068 annotation = getattr(node, 'annotation', None) 

2069 result: ActionList = [ 

2070 (self.name, node.arg), 

2071 ] 

2072 if annotation: 

2073 result.extend([ 

2074 (self.op, ':'), 

2075 (self.visit, annotation), 

2076 ]) 

2077 return result 

2078 

2079 #@+node:ekr.20220330133336.5: *6* iterative.arguments 

2080 # arguments = ( 

2081 # arg* posonlyargs, arg* args, arg? vararg, arg* kwonlyargs, 

2082 # expr* kw_defaults, arg? kwarg, expr* defaults 

2083 # ) 

2084 

2085 def do_arguments(self, node: Node) -> ActionList: 

2086 """Arguments to ast.Function or ast.Lambda, **not** ast.Call.""" 

2087 # 

2088 # No need to generate commas anywhere below. 

2089 # 

2090 # Let block. Some fields may not exist pre Python 3.8. 

2091 n_plain = len(node.args) - len(node.defaults) 

2092 posonlyargs = getattr(node, 'posonlyargs', []) 

2093 vararg = getattr(node, 'vararg', None) 

2094 kwonlyargs = getattr(node, 'kwonlyargs', []) 

2095 kw_defaults = getattr(node, 'kw_defaults', []) 

2096 kwarg = getattr(node, 'kwarg', None) 

2097 result: ActionList = [] 

2098 # 1. Sync the position-only args. 

2099 if posonlyargs: 

2100 for n, z in enumerate(posonlyargs): 

2101 result.append((self.visit, z)) 

2102 result.append((self.op, '/')) 

2103 # 2. Sync all args. 

2104 for i, z in enumerate(node.args): 

2105 result.append((self.visit, z)) 

2106 if i >= n_plain: 

2107 result.extend([ 

2108 (self.op, '='), 

2109 (self.visit, node.defaults[i - n_plain]), 

2110 ]) 

2111 # 3. Sync the vararg. 

2112 if vararg: 

2113 result.extend([ 

2114 (self.op, '*'), 

2115 (self.visit, vararg), 

2116 ]) 

2117 # 4. Sync the keyword-only args. 

2118 if kwonlyargs: 

2119 if not vararg: 

2120 result.append((self.op, '*')) 

2121 for n, z in enumerate(kwonlyargs): 

2122 result.append((self.visit, z)) 

2123 val = kw_defaults[n] 

2124 if val is not None: 

2125 result.extend([ 

2126 (self.op, '='), 

2127 (self.visit, val), 

2128 ]) 

2129 # 5. Sync the kwarg. 

2130 if kwarg: 

2131 result.extend([ 

2132 (self.op, '**'), 

2133 (self.visit, kwarg), 

2134 ]) 

2135 return result 
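
# Editorial trace (not part of leoAst.py): for
#     def f(a, b=1, *args, c, d=2, **kw): ...
# the actions above queue, in order and with no commas:
#     a   b = 1   * args   c   d = 2   ** kw
# n_plain = len(node.args) - len(node.defaults) is 1 here, so only 'a'
# is synced without an '=' default.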

2136 

2137 

2138 

2139 #@+node:ekr.20220330133336.6: *6* iterative.AsyncFunctionDef 

2140 # AsyncFunctionDef(identifier name, arguments args, stmt* body, expr* decorator_list, 

2141 # expr? returns) 

2142 

2143 def do_AsyncFunctionDef(self, node: Node) -> ActionList: 

2144 

2145 returns = getattr(node, 'returns', None) 

2146 result: ActionList = [] 

2147 # Decorators... 

2148 # @{z}\n 

2149 for z in node.decorator_list or []: 

2150 result.extend([ 

2151 (self.op, '@'), 

2152 (self.visit, z) 

2153 ]) 

2154 # Signature... 

2155 # def name(args): -> returns\n 

2156 # def name(args):\n 

2157 result.extend([ 

2158 (self.name, 'async'), 

2159 (self.name, 'def'), 

2160 (self.name, node.name), # A string. 

2161 (self.op, '('), 

2162 (self.visit, node.args), 

2163 (self.op, ')'), 

2164 ]) 

2165 if returns is not None: 

2166 result.extend([ 

2167 (self.op, '->'), 

2168 (self.visit, node.returns), 

2169 ]) 

2170 # Body... 

2171 result.extend([ 

2172 (self.op, ':'), 

2173 (self.visit, node.body), 

2174 ]) 

2175 return result 

2176 #@+node:ekr.20220330133336.7: *6* iterative.ClassDef 

2177 def do_ClassDef(self, node: Node) -> ActionList: 

2178 

2179 result: ActionList = [] 

2180 for z in node.decorator_list or []: 

2181 # @{z}\n 

2182 result.extend([ 

2183 (self.op, '@'), 

2184 (self.visit, z), 

2185 ]) 

2186 # class name(bases):\n 

2187 result.extend([ 

2188 (self.name, 'class'), 

2189 (self.name, node.name), # A string. 

2190 ]) 

2191 if node.bases: 

2192 result.extend([ 

2193 (self.op, '('), 

2194 (self.visit, node.bases), 

2195 (self.op, ')'), 

2196 ]) 

2197 result.extend([ 

2198 (self.op, ':'), 

2199 (self.visit, node.body), 

2200 ]) 

2201 return result 

2202 #@+node:ekr.20220330133336.8: *6* iterative.FunctionDef 

2203 # FunctionDef( 

2204 # identifier name, arguments args, 

2205 # stmt* body, 

2206 # expr* decorator_list, 

2207 # expr? returns, 

2208 # string? type_comment) 

2209 

2210 def do_FunctionDef(self, node: Node) -> ActionList: 

2211 

2212 returns = getattr(node, 'returns', None) 

2213 result: ActionList = [] 

2214 # Decorators... 

2215 # @{z}\n 

2216 for z in node.decorator_list or []: 

2217 result.extend([ 

2218 (self.op, '@'), 

2219 (self.visit, z) 

2220 ]) 

2221 # Signature... 

2222 # def name(args): -> returns\n 

2223 # def name(args):\n 

2224 result.extend([ 

2225 (self.name, 'def'), 

2226 (self.name, node.name), # A string. 

2227 (self.op, '('), 

2228 (self.visit, node.args), 

2229 (self.op, ')'), 

2230 ]) 

2231 if returns is not None: 

2232 result.extend([ 

2233 (self.op, '->'), 

2234 (self.visit, node.returns), 

2235 ]) 

2236 # Body... 

2237 result.extend([ 

2238 (self.op, ':'), 

2239 (self.visit, node.body), 

2240 ]) 

2241 return result 

2242 #@+node:ekr.20220330133336.9: *6* iterative.Interactive 

2243 def do_Interactive(self, node: Node) -> ActionList: # pragma: no cover 

2244 

2245 return [ 

2246 (self.visit, node.body), 

2247 ] 

2248 #@+node:ekr.20220330133336.10: *6* iterative.Lambda 

2249 def do_Lambda(self, node: Node) -> ActionList: 

2250 

2251 return [ 

2252 (self.name, 'lambda'), 

2253 (self.visit, node.args), 

2254 (self.op, ':'), 

2255 (self.visit, node.body), 

2256 ] 

2257 

2258 #@+node:ekr.20220330133336.11: *6* iterative.Module 

2259 def do_Module(self, node: Node) -> ActionList: 

2260 

2261 # Encoding is a non-syncing statement. 

2262 return [ 

2263 (self.visit, node.body), 

2264 ] 

2265 #@+node:ekr.20220330133336.12: *5* iterative: Expressions 

2266 #@+node:ekr.20220330133336.13: *6* iterative.Expr 

2267 def do_Expr(self, node: Node) -> ActionList: 

2268 """An outer expression.""" 

2269 # No need to put parentheses. 

2270 return [ 

2271 (self.visit, node.value), 

2272 ] 

2273 #@+node:ekr.20220330133336.14: *6* iterative.Expression 

2274 def do_Expression(self, node: Node) -> ActionList: # pragma: no cover 

2275 """An inner expression.""" 

2276 # No need to put parentheses. 

2277 return [ 

2278 (self.visit, node.body), 

2279 ] 

2280 #@+node:ekr.20220330133336.15: *6* iterative.GeneratorExp 

2281 def do_GeneratorExp(self, node: Node) -> ActionList: 

2282 # '<gen %s for %s>' % (elt, ','.join(gens)) 

2283 # No need to put parentheses or commas. 

2284 return [ 

2285 (self.visit, node.elt), 

2286 (self.visit, node.generators), 

2287 ] 

2288 #@+node:ekr.20220330133336.16: *6* iterative.NamedExpr 

2289 # NamedExpr(expr target, expr value) 

2290 

2291 def do_NamedExpr(self, node: Node) -> ActionList: # Python 3.8+ 

2292 

2293 return [ 

2294 (self.visit, node.target), 

2295 (self.op, ':='), 

2296 (self.visit, node.value), 

2297 ] 

2298 #@+node:ekr.20220402160128.1: *5* iterative: Operands 

2299 #@+node:ekr.20220402160128.2: *6* iterative.Attribute 

2300 # Attribute(expr value, identifier attr, expr_context ctx) 

2301 

2302 def do_Attribute(self, node: Node) -> ActionList: 

2303 

2304 return [ 

2305 (self.visit, node.value), 

2306 (self.op, '.'), 

2307 (self.name, node.attr), # A string. 

2308 ] 

2309 #@+node:ekr.20220402160128.3: *6* iterative.Bytes 

2310 def do_Bytes(self, node: Node) -> ActionList: 

2311 

2312 """ 

2313 It's invalid to mix bytes and non-bytes literals, so just 

2314 advancing to the next 'string' token suffices. 

2315 """ 

2316 token = self.find_next_significant_token() 

2317 return [ 

2318 (self.token, ('string', token.value)), 

2319 ] 

2320 #@+node:ekr.20220402160128.4: *6* iterative.comprehension 

2321 # comprehension = (expr target, expr iter, expr* ifs, int is_async) 

2322 

2323 def do_comprehension(self, node: Node) -> ActionList: 

2324 

2325 # No need to put parentheses. 

2326 result: ActionList = [ 

2327 (self.name, 'for'), 

2328 (self.visit, node.target), # A name 

2329 (self.name, 'in'), 

2330 (self.visit, node.iter), 

2331 ] 

2332 for z in node.ifs or []: 

2333 result.extend([ 

2334 (self.name, 'if'), 

2335 (self.visit, z), 

2336 ]) 

2337 return result 

2338 #@+node:ekr.20220402160128.5: *6* iterative.Constant 

2339 def do_Constant(self, node: Node) -> ActionList: # pragma: no cover 

2340 """ 

2341 

2342 https://greentreesnakes.readthedocs.io/en/latest/nodes.html 

2343 

2344 A constant. The value attribute holds the Python object it represents. 

2345 This can be simple types such as a number, string or None, but also 

2346 immutable container types (tuples and frozensets) if all of their 

2347 elements are constant. 

2348 """ 

2349 # Support Python 3.8. 

2350 if node.value is None or isinstance(node.value, bool): 

2351 # Weird: return a name! 

2352 return [ 

2353 (self.token, ('name', repr(node.value))), 

2354 ] 

2355 if node.value == Ellipsis: 

2356 return [ 

2357 (self.op, '...'), 

2358 ] 

2359 if isinstance(node.value, str): 

2360 return self.do_Str(node) 

2361 if isinstance(node.value, (int, float)): 

2362 return [ 

2363 (self.token, ('number', repr(node.value))), 

2364 ] 

2365 if isinstance(node.value, bytes): 

2366 return self.do_Bytes(node) 

2367 if isinstance(node.value, tuple): 

2368 return self.do_Tuple(node) 

2369 if isinstance(node.value, frozenset): 

2370 return self.do_Set(node) 

2371 g.trace('----- Oops -----', repr(node.value), g.callers()) 

2372 return [] 
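
# Editorial demo (not part of leoAst.py): in Python 3.8+, every literal
# parses to ast.Constant, which is why this one visitor must dispatch on
# the *type* of node.value.
import ast
node = ast.parse('42').body[0].value
assert isinstance(node, ast.Constant) and node.value == 42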

2373 

2374 #@+node:ekr.20220402160128.6: *6* iterative.Dict 

2375 # Dict(expr* keys, expr* values) 

2376 

2377 def do_Dict(self, node: Node) -> ActionList: 

2378 

2379 assert len(node.keys) == len(node.values) 

2380 result: ActionList = [ 

2381 (self.op, '{'), 

2382 ] 

2383 # No need to put commas. 

2384 for i, key in enumerate(node.keys): 

2385 key, value = node.keys[i], node.values[i] 

2386 result.extend([ 

2387 (self.visit, key), # a Str node. 

2388 (self.op, ':'), 

2389 ]) 

2390 if value is not None: 

2391 result.append((self.visit, value)) 

2392 result.append((self.op, '}')) 

2393 return result 

2394 #@+node:ekr.20220402160128.7: *6* iterative.DictComp 

2395 # DictComp(expr key, expr value, comprehension* generators) 

2396 

2397 # d2 = {val: key for key, val in d} 

2398 

2399 def do_DictComp(self, node: Node) -> ActionList: 

2400 

2401 result: ActionList = [ 

2402 (self.token, ('op', '{')), 

2403 (self.visit, node.key), 

2404 (self.op, ':'), 

2405 (self.visit, node.value), 

2406 ] 

2407 for z in node.generators or []: 

2408 result.extend([ 

2409 (self.visit, z), 

2410 (self.token, ('op', '}')), 

2411 ]) 

2412 return result 

2413 

2414 #@+node:ekr.20220402160128.8: *6* iterative.Ellipsis 

2415 def do_Ellipsis(self, node: Node) -> ActionList: # pragma: no cover (Does not exist for python 3.8+) 

2416 

2417 return [ 

2418 (self.op, '...'), 

2419 ] 

2420 #@+node:ekr.20220402160128.9: *6* iterative.ExtSlice 

2421 # https://docs.python.org/3/reference/expressions.html#slicings 

2422 

2423 # ExtSlice(slice* dims) 

2424 

2425 def do_ExtSlice(self, node: Node) -> ActionList: # pragma: no cover (deprecated) 

2426 

2427 result: ActionList = [] 

2428 for i, z in enumerate(node.dims): 

2429 result.append((self.visit, z)) 

2430 if i < len(node.dims) - 1: 

2431 result.append((self.op, ',')) 

2432 return result 

2433 #@+node:ekr.20220402160128.10: *6* iterative.Index 

2434 def do_Index(self, node: Node) -> ActionList: # pragma: no cover (deprecated) 

2435 

2436 return [ 

2437 (self.visit, node.value), 

2438 ] 

2439 #@+node:ekr.20220402160128.11: *6* iterative.FormattedValue: not called! 

2440 # FormattedValue(expr value, int? conversion, expr? format_spec) 

2441 

2442 def do_FormattedValue(self, node: Node) -> ActionList: # pragma: no cover 

2443 """ 

2444 This node represents the *components* of a *single* f-string. 

2445 

2446 Happily, JoinedStr nodes *also* represent *all* f-strings, 

2447 so the TOG should *never* visit this node! 

2448 """ 

2449 filename = getattr(self, 'filename', '<no file>') 

2450 raise AssignLinksError( 

2451 f"file: {filename}\n" 

2452 f"do_FormattedValue should never be called") 

2453 

2454 # This code has no chance of being useful... 

2455 # conv = node.conversion 

2456 # spec = node.format_spec 

2457 # self.visit(node.value) 

2458 # if conv is not None: 

2459 # self.token('number', conv) 

2460 # if spec is not None: 

2461 # self.visit(node.format_spec) 

2462 #@+node:ekr.20220402160128.12: *6* iterative.JoinedStr & helpers 

2463 # JoinedStr(expr* values) 

2464 

2465 def do_JoinedStr(self, node: Node) -> ActionList: 

2466 """ 

2467 JoinedStr nodes represent at least one f-string and all other strings 

2468 concatenated to it. 

2469 

2470 Analyzing JoinedStr.values would be extremely tricky, for reasons that 

2471 need not be explained here. 

2472 

2473 Instead, we get the tokens *from the token list itself*! 

2474 """ 

2475 return [ 

2476 (self.token, (z.kind, z.value)) 

2477 for z in self.get_concatenated_string_tokens() 

2478 ] 

2479 #@+node:ekr.20220402160128.13: *6* iterative.List 

2480 def do_List(self, node: Node) -> ActionList: 

2481 

2482 # No need to put commas. 

2483 return [ 

2484 (self.op, '['), 

2485 (self.visit, node.elts), 

2486 (self.op, ']'), 

2487 ] 

2488 #@+node:ekr.20220402160128.14: *6* iterative.ListComp 

2489 # ListComp(expr elt, comprehension* generators) 

2490 

2491 def do_ListComp(self, node: Node) -> ActionList: 

2492 

2493 result: ActionList = [ 

2494 (self.op, '['), 

2495 (self.visit, node.elt), 

2496 ] 

2497 for z in node.generators: 

2498 result.append((self.visit, z)) 

2499 result.append((self.op, ']')) 

2500 return result 

2501 #@+node:ekr.20220402160128.15: *6* iterative.Name & NameConstant 

2502 def do_Name(self, node: Node) -> ActionList: 

2503 

2504 return [ 

2505 (self.name, node.id), 

2506 ] 

2507 

2508 def do_NameConstant(self, node: Node) -> ActionList: # pragma: no cover (Does not exist in Python 3.8+) 

2509 

2510 return [ 

2511 (self.name, repr(node.value)), 

2512 ] 

2513 #@+node:ekr.20220402160128.16: *6* iterative.Num 

2514 def do_Num(self, node: Node) -> ActionList: # pragma: no cover (Does not exist in Python 3.8+) 

2515 

2516 return [ 

2517 (self.token, ('number', node.n)), 

2518 ] 

2519 #@+node:ekr.20220402160128.17: *6* iterative.Set 

2520 # Set(expr* elts) 

2521 

2522 def do_Set(self, node: Node) -> ActionList: 

2523 

2524 return [ 

2525 (self.op, '{'), 

2526 (self.visit, node.elts), 

2527 (self.op, '}'), 

2528 ] 

2529 #@+node:ekr.20220402160128.18: *6* iterative.SetComp 

2530 # SetComp(expr elt, comprehension* generators) 

2531 

2532 def do_SetComp(self, node: Node) -> ActionList: 

2533 

2534 result: ActionList = [ 

2535 (self.op, '{'), 

2536 (self.visit, node.elt), 

2537 ] 

2538 for z in node.generators or []: 

2539 result.append((self.visit, z)) 

2540 result.append((self.op, '}')) 

2541 return result 

2542 #@+node:ekr.20220402160128.19: *6* iterative.Slice 

2543 # slice = Slice(expr? lower, expr? upper, expr? step) 

2544 

2545 def do_Slice(self, node: Node) -> ActionList: 

2546 

2547 lower = getattr(node, 'lower', None) 

2548 upper = getattr(node, 'upper', None) 

2549 step = getattr(node, 'step', None) 

2550 result: ActionList = [] 

2551 if lower is not None: 

2552 result.append((self.visit, lower)) 

2553 # Always put the colon between lower and upper. 

2554 result.append((self.op, ':')) 

2555 if upper is not None: 

2556 result.append((self.visit, upper)) 

2557 # Put the second colon if it exists in the token list. 

2558 if step is None: 

2559 result.append((self.slice_helper, node)) 

2560 else: 

2561 result.extend([ 

2562 (self.op, ':'), 

2563 (self.visit, step), 

2564 ]) 

2565 return result 

2566 

2567 def slice_helper(self, node: Node) -> ActionList: 

2568 """Delayed evaluation!""" 

2569 token = self.find_next_significant_token() 

2570 if token and token.value == ':': 

2571 return [ 

2572 (self.op, ':'), 

2573 ] 

2574 return [] 
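
# Editorial demo (not part of leoAst.py): why slice_helper must consult
# the token list. These two sources produce identical parse trees:
import ast
assert ast.dump(ast.parse('a[1:2]')) == ast.dump(ast.parse('a[1:2:]'))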

2575 #@+node:ekr.20220402160128.20: *6* iterative.Str & helper 

2576 def do_Str(self, node: Node) -> ActionList: 

2577 """This node represents a string constant.""" 

2578 # This loop is necessary to handle string concatenation. 

2579 return [ 

2580 (self.token, (z.kind, z.value)) 

2581 for z in self.get_concatenated_string_tokens() 

2582 ] 

2583 

2584 #@+node:ekr.20220402160128.21: *7* iterative.get_concatenated_tokens 

2585 def get_concatenated_string_tokens(self) -> List: 

2586 """ 

2587 Return the next 'string' token and all 'string' tokens concatenated to 

2588 it. *Never* update self.px here. 

2589 """ 

2590 trace = False 

2591 tag = 'iterative.get_concatenated_string_tokens' 

2592 i = self.px 

2593 # First, find the next significant token. It should be a string. 

2594 i, token = i + 1, None 

2595 while i < len(self.tokens): 

2596 token = self.tokens[i] 

2597 i += 1 

2598 if token.kind == 'string': 

2599 # Rescan the string. 

2600 i -= 1 

2601 break 

2602 # An error. 

2603 if is_significant_token(token): # pragma: no cover 

2604 break 

2605 # Raise an error if we didn't find the expected 'string' token. 

2606 if not token or token.kind != 'string': # pragma: no cover 

2607 if not token: 

2608 token = self.tokens[-1] 

2609 filename = getattr(self, 'filename', '<no filename>') 

2610 raise AssignLinksError( 

2611 f"\n" 

2612 f"{tag}...\n" 

2613 f"file: {filename}\n" 

2614 f"line: {token.line_number}\n" 

2615 f" i: {i}\n" 

2616 f"expected 'string' token, got {token!s}") 

2617 # Accumulate string tokens. 

2618 assert self.tokens[i].kind == 'string' 

2619 results = [] 

2620 while i < len(self.tokens): 

2621 token = self.tokens[i] 

2622 i += 1 

2623 if token.kind == 'string': 

2624 results.append(token) 

2625 elif token.kind == 'op' or is_significant_token(token): 

2626 # Any significant token *or* any op will halt string concatenation. 

2627 break 

2628 # 'ws', 'nl', 'newline', 'comment', 'indent', 'dedent', etc. 

2629 # The (significant) 'endmarker' token ensures we will have a result. 

2630 assert results 

2631 if trace: # pragma: no cover 

2632 g.printObj(results, tag=f"{tag}: Results") 

2633 return results 
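
# Editorial demo (not part of leoAst.py): why this helper exists.
# Adjacent string literals are *one* ast node but *several* tokens.
import io, tokenize
code = "s = 'a' 'b'\n"
strings = [t.string for t in tokenize.generate_tokens(io.StringIO(code).readline)
    if t.type == tokenize.STRING]
assert strings == ["'a'", "'b'"]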

2634 #@+node:ekr.20220402160128.22: *6* iterative.Subscript 

2635 # Subscript(expr value, slice slice, expr_context ctx) 

2636 

2637 def do_Subscript(self, node: Node) -> ActionList: 

2638 

2639 return [ 

2640 (self.visit, node.value), 

2641 (self.op, '['), 

2642 (self.visit, node.slice), 

2643 (self.op, ']'), 

2644 ] 

2645 #@+node:ekr.20220402160128.23: *6* iterative.Tuple 

2646 # Tuple(expr* elts, expr_context ctx) 

2647 

2648 def do_Tuple(self, node: Node) -> ActionList: 

2649 

2650 # Do not call op for parens or commas here. 

2651 # They do not necessarily exist in the token list! 

2652 

2653 return [ 

2654 (self.visit, node.elts), 

2655 ] 

2656 #@+node:ekr.20220330133336.40: *5* iterative: Operators 

2657 #@+node:ekr.20220330133336.41: *6* iterative.BinOp 

2658 def do_BinOp(self, node: Node) -> ActionList: 

2659 

2660 return [ 

2661 (self.visit, node.left), 

2662 (self.op, op_name(node.op)), 

2663 (self.visit, node.right), 

2664 ] 

2665 

2666 #@+node:ekr.20220330133336.42: *6* iterative.BoolOp 

2667 # BoolOp(boolop op, expr* values) 

2668 

2669 def do_BoolOp(self, node: Node) -> ActionList: 

2670 

2671 result: ActionList = [] 

2672 op_name_ = op_name(node.op) 

2673 for i, z in enumerate(node.values): 

2674 result.append((self.visit, z)) 

2675 if i < len(node.values) - 1: 

2676 result.append((self.name, op_name_)) 

2677 return result 

2678 #@+node:ekr.20220330133336.43: *6* iterative.Compare 

2679 # Compare(expr left, cmpop* ops, expr* comparators) 

2680 

2681 def do_Compare(self, node: Node) -> ActionList: 

2682 

2683 assert len(node.ops) == len(node.comparators) 

2684 result: ActionList = [(self.visit, node.left)] 

2685 for i, z in enumerate(node.ops): 

2686 op_name_ = op_name(node.ops[i]) 

2687 if op_name_ in ('not in', 'is not'): 

2688 for z in op_name_.split(' '): 

2689 result.append((self.name, z)) 

2690 elif op_name_.isalpha(): 

2691 result.append((self.name, op_name_)) 

2692 else: 

2693 result.append((self.op, op_name_)) 

2694 result.append((self.visit, node.comparators[i])) 

2695 return result 

2696 #@+node:ekr.20220330133336.44: *6* iterative.UnaryOp 

2697 def do_UnaryOp(self, node: Node) -> ActionList: 

2698 

2699 op_name_ = op_name(node.op) 

2700 result: ActionList = [] 

2701 if op_name_.isalpha(): 

2702 result.append((self.name, op_name_)) 

2703 else: 

2704 result.append((self.op, op_name_)) 

2705 result.append((self.visit, node.operand)) 

2706 return result 

2707 #@+node:ekr.20220330133336.45: *6* iterative.IfExp (ternary operator) 

2708 # IfExp(expr test, expr body, expr orelse) 

2709 

2710 def do_IfExp(self, node: Node) -> ActionList: 

2711 

2712 #'%s if %s else %s' 

2713 return [ 

2714 (self.visit, node.body), 

2715 (self.name, 'if'), 

2716 (self.visit, node.test), 

2717 (self.name, 'else'), 

2718 (self.visit, node.orelse), 

2719 ] 

2720 

2721 #@+node:ekr.20220330133336.46: *5* iterative: Statements 

2722 #@+node:ekr.20220330133336.47: *6* iterative.Starred 

2723 # Starred(expr value, expr_context ctx) 

2724 

2725 def do_Starred(self, node: Node) -> ActionList: 

2726 """A starred argument to an ast.Call""" 

2727 return [ 

2728 (self.op, '*'), 

2729 (self.visit, node.value), 

2730 ] 

2731 #@+node:ekr.20220330133336.48: *6* iterative.AnnAssign 

2732 # AnnAssign(expr target, expr annotation, expr? value, int simple) 

2733 

2734 def do_AnnAssign(self, node: Node) -> ActionList: 

2735 

2736 # {node.target}:{node.annotation}={node.value}\n' 

2737 result: ActionList = [ 

2738 (self.visit, node.target), 

2739 (self.op, ':'), 

2740 (self.visit, node.annotation), 

2741 ] 

2742 if node.value is not None: # #1851 

2743 result.extend([ 

2744 (self.op, '='), 

2745 (self.visit, node.value), 

2746 ]) 

2747 return result 

2748 #@+node:ekr.20220330133336.49: *6* iterative.Assert 

2749 # Assert(expr test, expr? msg) 

2750 

2751 def do_Assert(self, node: Node) -> ActionList: 

2752 

2753 # No need to put parentheses or commas. 

2754 msg = getattr(node, 'msg', None) 

2755 result: ActionList = [ 

2756 (self.name, 'assert'), 

2757 (self.visit, node.test), 

2758 ] 

2759 if msg is not None: 

2760 result.append((self.visit, node.msg)) 

2761 return result 

2762 #@+node:ekr.20220330133336.50: *6* iterative.Assign 

2763 def do_Assign(self, node: Node) -> ActionList: 

2764 

2765 result: ActionList = [] 

2766 for z in node.targets: 

2767 result.extend([ 

2768 (self.visit, z), 

2769 (self.op, '=') 

2770 ]) 

2771 result.append((self.visit, node.value)) 

2772 return result 

2773 #@+node:ekr.20220330133336.51: *6* iterative.AsyncFor 

2774 def do_AsyncFor(self, node: Node) -> ActionList: 

2775 

2776 # The def line... 

2777 # Py 3.8 changes the kind of token. 

2778 async_token_type = 'async' if has_async_tokens else 'name' 

2779 result: ActionList = [ 

2780 (self.token, (async_token_type, 'async')), 

2781 (self.name, 'for'), 

2782 (self.visit, node.target), 

2783 (self.name, 'in'), 

2784 (self.visit, node.iter), 

2785 (self.op, ':'), 

2786 # Body... 

2787 (self.visit, node.body), 

2788 ] 

2789 # Else clause... 

2790 if node.orelse: 

2791 result.extend([ 

2792 (self.name, 'else'), 

2793 (self.op, ':'), 

2794 (self.visit, node.orelse), 

2795 ]) 

2796 return result 

2797 #@+node:ekr.20220330133336.52: *6* iterative.AsyncWith 

2798 def do_AsyncWith(self, node: Node) -> ActionList: 

2799 

2800 async_token_type = 'async' if has_async_tokens else 'name' 

2801 return [ 

2802 (self.token, (async_token_type, 'async')), 

2803 (self.do_With, node), 

2804 ] 

2805 #@+node:ekr.20220330133336.53: *6* iterative.AugAssign 

2806 # AugAssign(expr target, operator op, expr value) 

2807 

2808 def do_AugAssign(self, node: Node) -> ActionList: 

2809 

2810 # %s%s=%s\n' 

2811 return [ 

2812 (self.visit, node.target), 

2813 (self.op, op_name(node.op) + '='), 

2814 (self.visit, node.value), 

2815 ] 

2816 #@+node:ekr.20220330133336.54: *6* iterative.Await 

2817 # Await(expr value) 

2818 

2819 def do_Await(self, node: Node) -> ActionList: 

2820 

2821 #'await %s\n' 

2822 async_token_type = 'await' if has_async_tokens else 'name' 

2823 return [ 

2824 (self.token, (async_token_type, 'await')), 

2825 (self.visit, node.value), 

2826 ] 

2827 #@+node:ekr.20220330133336.55: *6* iterative.Break 

2828 def do_Break(self, node: Node) -> ActionList: 

2829 

2830 return [ 

2831 (self.name, 'break'), 

2832 ] 

2833 #@+node:ekr.20220330133336.56: *6* iterative.Call & helpers 

2834 # Call(expr func, expr* args, keyword* keywords) 

2835 

2836 # Python 3 ast.Call nodes do not have 'starargs' or 'kwargs' fields. 

2837 

2838 def do_Call(self, node: Node) -> ActionList: 

2839 

2840 # The calls to op(')') and op('(') do nothing by default. 

2841 # No need to generate any commas. 

2842 # Subclasses might handle them in an overridden iterative.set_links. 

2843 return [ 

2844 (self.visit, node.func), 

2845 (self.op, '('), 

2846 (self.handle_call_arguments, node), 

2847 (self.op, ')'), 

2848 ] 

2849 #@+node:ekr.20220330133336.57: *7* iterative.arg_helper 

2850 def arg_helper(self, node: Node) -> ActionList: 

2851 """ 

2852 Yield the node, with a special case for strings. 

2853 """ 

2854 result: ActionList = [] 

2855 if isinstance(node, str): 

2856 result.append((self.token, ('name', node))) 

2857 else: 

2858 result.append((self.visit, node)) 

2859 return result 

2860 #@+node:ekr.20220330133336.58: *7* iterative.handle_call_arguments 

2861 def handle_call_arguments(self, node: Node) -> ActionList: 

2862 """ 

2863 Generate arguments in the correct order. 

2864 

2865 Call(expr func, expr* args, keyword* keywords) 

2866 

2867 https://docs.python.org/3/reference/expressions.html#calls 

2868 

2869 Warning: This code will fail on Python 3.8 only for calls 

2870 containing kwargs in unexpected places. 

2871 """ 

2872 # *args: in node.args[]: Starred(value=Name(id='args')) 

2873 # *[a, 3]: in node.args[]: Starred(value=List(elts=[Name(id='a'), Num(n=3)]) 

2874 # **kwargs: in node.keywords[]: keyword(arg=None, value=Name(id='kwargs')) 

2875 # 

2876 # Scan args for *name or *List 

2877 args = node.args or [] 

2878 keywords = node.keywords or [] 

2879 

2880 def get_pos(obj: Any) -> Tuple: 

2881 line1 = getattr(obj, 'lineno', None) 

2882 col1 = getattr(obj, 'col_offset', None) 

2883 return line1, col1, obj 

2884 

2885 def sort_key(aTuple: Tuple) -> int: 

2886 line, col, obj = aTuple 

2887 return line * 1000 + col 

2888 

2889 assert py_version >= (3, 9) 

2890 

2891 places = [get_pos(z) for z in args + keywords] 

2892 places.sort(key=sort_key) 

2893 ordered_args = [z[2] for z in places] 

2894 result: ActionList = [] 

2895 for z in ordered_args: 

2896 if isinstance(z, ast.Starred): 

2897 result.extend([ 

2898 (self.op, '*'), 

2899 (self.visit, z.value), 

2900 ]) 

2901 elif isinstance(z, ast.keyword): 

2902 if getattr(z, 'arg', None) is None: 

2903 result.extend([ 

2904 (self.op, '**'), 

2905 (self.arg_helper, z.value), 

2906 ]) 

2907 else: 

2908 result.extend([ 

2909 (self.arg_helper, z.arg), 

2910 (self.op, '='), 

2911 (self.arg_helper, z.value), 

2912 ]) 

2913 else: 

2914 result.append((self.arg_helper, z)) 

2915 return result 
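
# Editorial demo (not part of leoAst.py, Python 3.9+): the ast splits a
# call's arguments into two lists, losing their interleaved source order,
# so positions are the only way to recover it. ast.keyword nodes gained
# lineno/col_offset in Python 3.9, hence the assert above.
import ast
call = ast.parse('f(a, x=1, *b, y=2)').body[0].value
order = sorted(call.args + call.keywords, key=lambda z: (z.lineno, z.col_offset))
kinds = [z.__class__.__name__ for z in order]
assert kinds == ['Name', 'keyword', 'Starred', 'keyword']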

2916 #@+node:ekr.20220330133336.59: *6* iterative.Continue 

2917 def do_Continue(self, node: Node) -> ActionList: 

2918 

2919 return [ 

2920 (self.name, 'continue'), 

2921 ] 

2922 #@+node:ekr.20220330133336.60: *6* iterative.Delete 

2923 def do_Delete(self, node: Node) -> ActionList: 

2924 

2925 # No need to put commas. 

2926 return [ 

2927 (self.name, 'del'), 

2928 (self.visit, node.targets), 

2929 ] 

2930 #@+node:ekr.20220330133336.61: *6* iterative.ExceptHandler 

2931 def do_ExceptHandler(self, node: Node) -> ActionList: 

2932 

2933 # Except line... 

2934 result: ActionList = [ 

2935 (self.name, 'except'), 

2936 ] 

2937 if getattr(node, 'type', None): 

2938 result.append((self.visit, node.type)) 

2939 if getattr(node, 'name', None): 

2940 result.extend([ 

2941 (self.name, 'as'), 

2942 (self.name, node.name), 

2943 ]) 

2944 result.extend([ 

2945 (self.op, ':'), 

2946 # Body... 

2947 (self.visit, node.body), 

2948 ]) 

2949 return result 

2950 #@+node:ekr.20220330133336.62: *6* iterative.For 

2951 def do_For(self, node: Node) -> ActionList: 

2952 

2953 result: ActionList = [ 

2954 # The def line... 

2955 (self.name, 'for'), 

2956 (self.visit, node.target), 

2957 (self.name, 'in'), 

2958 (self.visit, node.iter), 

2959 (self.op, ':'), 

2960 # Body... 

2961 (self.visit, node.body), 

2962 ] 

2963 # Else clause... 

2964 if node.orelse: 

2965 result.extend([ 

2966 (self.name, 'else'), 

2967 (self.op, ':'), 

2968 (self.visit, node.orelse), 

2969 ]) 

2970 return result 

2971 #@+node:ekr.20220330133336.63: *6* iterative.Global 

2972 # Global(identifier* names) 

2973 

2974 def do_Global(self, node: Node) -> ActionList: 

2975 

2976 result = [ 

2977 (self.name, 'global'), 

2978 ] 

2979 for z in node.names: 

2980 result.append((self.name, z)) 

2981 return result 

2982 #@+node:ekr.20220330133336.64: *6* iterative.If & helpers 

2983 # If(expr test, stmt* body, stmt* orelse) 

2984 

2985 def do_If(self, node: Node) -> ActionList: 

2986 #@+<< do_If docstring >> 

2987 #@+node:ekr.20220330133336.65: *7* << do_If docstring >> 

2988 """ 

2989 The parse trees for the following are identical! 

2990 

2991     if 1:               if 1: 

2992         pass                pass 

2993     else:               elif 2: 

2994         if 2:               pass 

2995             pass 

2996 

2997 So there is *no* way for the 'if' visitor to disambiguate the above two 

2998 cases from the parse tree alone. 

2999 

3000 Instead, we scan the tokens list for the next 'if', 'else' or 'elif' token. 

3001 """ 

3002 #@-<< do_If docstring >> 

3003 # Use the next significant token to distinguish between 'if' and 'elif'. 

3004 token = self.find_next_significant_token() 

3005 result: ActionList = [ 

3006 (self.name, token.value), 

3007 (self.visit, node.test), 

3008 (self.op, ':'), 

3009 # Body... 

3010 (self.visit, node.body), 

3011 ] 

3012 # Else and elif clauses... 

3013 if node.orelse: 

3014 # We *must* delay the evaluation of the else clause. 

3015 result.append((self.if_else_helper, node)) 

3016 return result 

3017 

3018 def if_else_helper(self, node: Node) -> ActionList: 

3019 """Delayed evaluation!""" 

3020 token = self.find_next_significant_token() 

3021 if token.value == 'else': 

3022 return [ 

3023 (self.name, 'else'), 

3024 (self.op, ':'), 

3025 (self.visit, node.orelse), 

3026 ] 

3027 return [ 

3028 (self.visit, node.orelse), 

3029 ] 
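
# Editorial demo (not part of leoAst.py): the ambiguity described above.
# These two programs have identical parse trees; only their token lists
# (an 'elif' name vs. an 'else' name followed by 'if') differ.
import ast
s1 = 'if 1:\n    pass\nelse:\n    if 2:\n        pass'
s2 = 'if 1:\n    pass\nelif 2:\n    pass'
assert ast.dump(ast.parse(s1)) == ast.dump(ast.parse(s2))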

3030 #@+node:ekr.20220330133336.66: *6* iterative.Import & helper 

3031 def do_Import(self, node: Node) -> ActionList: 

3032 

3033 result: ActionList = [ 

3034 (self.name, 'import'), 

3035 ] 

3036 for alias in node.names: 

3037 result.append((self.name, alias.name)) 

3038 if alias.asname: 

3039 result.extend([ 

3040 (self.name, 'as'), 

3041 (self.name, alias.asname), 

3042 ]) 

3043 return result 

3044 #@+node:ekr.20220330133336.67: *6* iterative.ImportFrom 

3045 # ImportFrom(identifier? module, alias* names, int? level) 

3046 

3047 def do_ImportFrom(self, node: Node) -> ActionList: 

3048 

3049 result: ActionList = [ 

3050 (self.name, 'from'), 

3051 ] 

3052 for i in range(node.level): 

3053 result.append((self.op, '.')) 

3054 if node.module: 

3055 result.append((self.name, node.module)) 

3056 result.append((self.name, 'import')) 

3057 # No need to put commas. 

3058 for alias in node.names: 

3059 if alias.name == '*': # #1851. 

3060 result.append((self.op, '*')) 

3061 else: 

3062 result.append((self.name, alias.name)) 

3063 if alias.asname: 

3064 result.extend([ 

3065 (self.name, 'as'), 

3066 (self.name, alias.asname), 

3067 ]) 

3068 return result 

3069 #@+node:ekr.20220402124844.1: *6* iterative.Match* (Python 3.10+) 

3070 # Match(expr subject, match_case* cases) 

3071 

3072 # match_case = (pattern pattern, expr? guard, stmt* body) 

3073 

3074 # Full syntax diagram: # https://peps.python.org/pep-0634/#appendix-a 

3075 

3076 def do_Match(self, node: Node) -> ActionList: 

3077 

3078 cases = getattr(node, 'cases', []) 

3079 result: ActionList = [ 

3080 (self.name, 'match'), 

3081 (self.visit, node.subject), 

3082 (self.op, ':'), 

3083 ] 

3084 for case in cases: 

3085 result.append((self.visit, case)) 

3086 return result 

3087 #@+node:ekr.20220402124844.2: *7* iterative.match_case 

3088 # match_case = (pattern pattern, expr? guard, stmt* body) 

3089 

3090 def do_match_case(self, node: Node) -> ActionList: 

3091 

3092 guard = getattr(node, 'guard', None) 

3093 body = getattr(node, 'body', []) 

3094 result: ActionList = [ 

3095 (self.name, 'case'), 

3096 (self.visit, node.pattern), 

3097 ] 

3098 if guard: 

3099 result.extend([ 

3100 (self.name, 'if'), 

3101 (self.visit, guard), 

3102 ]) 

3103 result.append((self.op, ':')) 

3104 for statement in body: 

3105 result.append((self.visit, statement)) 

3106 return result 

3107 #@+node:ekr.20220402124844.3: *7* iterative.MatchAs 

3108 # MatchAs(pattern? pattern, identifier? name) 

3109 

3110 def do_MatchAs(self, node: Node) -> ActionList: 

3111 pattern = getattr(node, 'pattern', None) 

3112 name = getattr(node, 'name', None) 

3113 result: ActionList = [] 

3114 if pattern and name: 

3115 result.extend([ 

3116 (self.visit, pattern), 

3117 (self.name, 'as'), 

3118 (self.name, name), 

3119 ]) 

3120 elif pattern: 

3121 result.append((self.visit, pattern)) # pragma: no cover 

3122 else: 

3123 result.append((self.name, name or '_')) 

3124 return result 

3125 #@+node:ekr.20220402124844.4: *7* iterative.MatchClass 

3126 # MatchClass(expr cls, pattern* patterns, identifier* kwd_attrs, pattern* kwd_patterns) 

3127 

3128 def do_MatchClass(self, node: Node) -> ActionList: 

3129 

3130 patterns = getattr(node, 'patterns', []) 

3131 kwd_attrs = getattr(node, 'kwd_attrs', []) 

3132 kwd_patterns = getattr(node, 'kwd_patterns', []) 

3133 result: ActionList = [ 

3134 (self.visit, node.cls), 

3135 (self.op, '('), 

3136 ] 

3137 for pattern in patterns: 

3138 result.append((self.visit, pattern)) 

3139 for i, kwd_attr in enumerate(kwd_attrs): 

3140 result.extend([ 

3141 (self.name, kwd_attr), # a String. 

3142 (self.op, '='), 

3143 (self.visit, kwd_patterns[i]), 

3144 ]) 

3145 result.append((self.op, ')')) 

3146 return result 

3147 #@+node:ekr.20220402124844.5: *7* iterative.MatchMapping 

3148 # MatchMapping(expr* keys, pattern* patterns, identifier? rest) 

3149 

3150 def do_MatchMapping(self, node: Node) -> ActionList: 

3151 keys = getattr(node, 'keys', []) 

3152 patterns = getattr(node, 'patterns', []) 

3153 rest = getattr(node, 'rest', None) 

3154 result: ActionList = [ 

3155 (self.op, '{'), 

3156 ] 

3157 for i, key in enumerate(keys): 

3158 result.extend([ 

3159 (self.visit, key), 

3160 (self.op, ':'), 

3161 (self.visit, patterns[i]), 

3162 ]) 

3163 if rest: 

3164 result.extend([ 

3165 (self.op, '**'), 

3166 (self.name, rest), # A string. 

3167 ]) 

3168 result.append((self.op, '}')) 

3169 return result 

3170 #@+node:ekr.20220402124844.6: *7* iterative.MatchOr 

3171 # MatchOr(pattern* patterns) 

3172 

3173 def do_MatchOr(self, node: Node) -> ActionList: 

3174 

3175 patterns = getattr(node, 'patterns', []) 

3176 result: ActionList = [] 

3177 for i, pattern in enumerate(patterns): 

3178 if i > 0: 

3179 result.append((self.op, '|')) 

3180 result.append((self.visit, pattern)) 

3181 return result 

3182 #@+node:ekr.20220402124844.7: *7* iterative.MatchSequence 

3183 # MatchSequence(pattern* patterns) 

3184 

3185 def do_MatchSequence(self, node: Node) -> ActionList: 

3186 patterns = getattr(node, 'patterns', []) 

3187 result: ActionList = [] 

3188 # Scan for the next '(' or '[' token, skipping the 'case' token. 

3189 token = None 

3190 for token in self.tokens[self.px + 1 :]: 

3191 if token.kind == 'op' and token.value in '([': 

3192 break 

3193 if is_significant_token(token): 

3194 # An implicit tuple: there is no '(' or '[' token. 

3195 token = None 

3196 break 

3197 else: 

3198 raise AssignLinksError('Ill-formed tuple') # pragma: no cover 

3199 if token: 

3200 result.append((self.op, token.value)) 

3201 for i, pattern in enumerate(patterns): 

3202 result.append((self.visit, pattern)) 

3203 if token: 

3204 val = ']' if token.value == '[' else ')' 

3205 result.append((self.op, val)) 

3206 return result 
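# Editor's note: the scan above distinguishes three source forms (a sketch):
#   case [a, b]:  ->  '[' token found: emit '[', the patterns, ']'
#   case (a, b):  ->  '(' token found: emit '(', the patterns, ')'
#   case a, b:    ->  implicit tuple: no bracket tokens exist to sync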

3207 #@+node:ekr.20220402124844.8: *7* iterative.MatchSingleton 

3208 # MatchSingleton(constant value) 

3209 

3210 def do_MatchSingleton(self, node: Node) -> ActionList: 

3211 """Match True, False or None.""" 

3212 return [ 

3213 (self.token, ('name', repr(node.value))), 

3214 ] 

3215 #@+node:ekr.20220402124844.9: *7* iterative.MatchStar 

3216 # MatchStar(identifier? name) 

3217 

3218 def do_MatchStar(self, node: Node) -> ActionList: 

3219 

3220 name = getattr(node, 'name', None) 

3221 result: ActionList = [ 

3222 (self.op, '*'), 

3223 ] 

3224 if name: 

3225 result.append((self.name, name)) 

3226 return result 

3227 #@+node:ekr.20220402124844.10: *7* iterative.MatchValue 

3228 # MatchValue(expr value) 

3229 

3230 def do_MatchValue(self, node: Node) -> ActionList: 

3231 

3232 return [ 

3233 (self.visit, node.value), 

3234 ] 

3235 #@+node:ekr.20220330133336.78: *6* iterative.Nonlocal 

3236 # Nonlocal(identifier* names) 

3237 

3238 def do_Nonlocal(self, node: Node) -> ActionList: 

3239 

3240 # 'nonlocal %s\n' % ','.join(node.names) 

3241 # No need to put commas. 

3242 result: ActionList = [ 

3243 (self.name, 'nonlocal'), 

3244 ] 

3245 for z in node.names: 

3246 result.append((self.name, z)) 

3247 return result 

3248 #@+node:ekr.20220330133336.79: *6* iterative.Pass 

3249 def do_Pass(self, node: Node) -> ActionList: 

3250 

3251 return ([ 

3252 (self.name, 'pass'), 

3253 ]) 

3254 #@+node:ekr.20220330133336.80: *6* iterative.Raise 

3255 # Raise(expr? exc, expr? cause) 

3256 

3257 def do_Raise(self, node: Node) -> ActionList: 

3258 

3259 # No need to put commas. 

3260 exc = getattr(node, 'exc', None) 

3261 cause = getattr(node, 'cause', None) 

3262 tback = getattr(node, 'tback', None) 

3263 result: ActionList = [ 

3264 (self.name, 'raise'), 

3265 (self.visit, exc), 

3266 ] 

3267 if cause: 

3268 result.extend([ 

3269 (self.name, 'from'), # #2446. 

3270 (self.visit, cause), 

3271 ]) 

3272 result.append((self.visit, tback)) 

3273 return result 

3274 

3275 #@+node:ekr.20220330133336.81: *6* iterative.Return 

3276 def do_Return(self, node: Node) -> ActionList: 

3277 

3278 return [ 

3279 (self.name, 'return'), 

3280 (self.visit, node.value), 

3281 ] 

3282 #@+node:ekr.20220330133336.82: *6* iterative.Try 

3283 # Try(stmt* body, excepthandler* handlers, stmt* orelse, stmt* finalbody) 

3284 

3285 def do_Try(self, node: Node) -> ActionList: 

3286 

3287 result: ActionList = [ 

3288 # Try line... 

3289 (self.name, 'try'), 

3290 (self.op, ':'), 

3291 # Body... 

3292 (self.visit, node.body), 

3293 (self.visit, node.handlers), 

3294 ] 

3295 # Else... 

3296 if node.orelse: 

3297 result.extend([ 

3298 (self.name, 'else'), 

3299 (self.op, ':'), 

3300 (self.visit, node.orelse), 

3301 ]) 

3302 # Finally... 

3303 if node.finalbody: 

3304 result.extend([ 

3305 (self.name, 'finally'), 

3306 (self.op, ':'), 

3307 (self.visit, node.finalbody), 

3308 ]) 

3309 return result 

3310 #@+node:ekr.20220330133336.83: *6* iterative.While 

3311 def do_While(self, node: Node) -> ActionList: 

3312 

3313 # While line... 

3314 # 'while %s:\n' 

3315 result: ActionList = [ 

3316 (self.name, 'while'), 

3317 (self.visit, node.test), 

3318 (self.op, ':'), 

3319 # Body... 

3320 (self.visit, node.body), 

3321 ] 

3322 # Else clause... 

3323 if node.orelse: 

3324 result.extend([ 

3325 (self.name, 'else'), 

3326 (self.op, ':'), 

3327 (self.visit, node.orelse), 

3328 ]) 

3329 return result 

3330 #@+node:ekr.20220330133336.84: *6* iterative.With 

3331 # With(withitem* items, stmt* body) 

3332 

3333 # withitem = (expr context_expr, expr? optional_vars) 

3334 

3335 def do_With(self, node: Node) -> ActionList: 

3336 

3337 expr: Optional[ast.AST] = getattr(node, 'context_expression', None) 

3338 items: List[ast.AST] = getattr(node, 'items', []) 

3339 result: ActionList = [ 

3340 (self.name, 'with'), 

3341 (self.visit, expr), 

3342 ] 

3343 # No need to put commas. 

3344 for item in items: 

3345 result.append((self.visit, item.context_expr)) 

3346 optional_vars = getattr(item, 'optional_vars', None) 

3347 if optional_vars is not None: 

3348 result.extend([ 

3349 (self.name, 'as'), 

3350 (self.visit, item.optional_vars), 

3351 ]) 

3352 result.extend([ 

3353 # End the line. 

3354 (self.op, ':'), 

3355 # Body... 

3356 (self.visit, node.body), 

3357 ]) 

3358 return result 
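# Editor's note: a sketch of the actions generated above for
#   with open(a) as f, lock:
# namely 'with', visit(open(a)), 'as', visit(f), visit(lock), ':', body.
# Per the 'No need to put commas' note above, commas between items are
# never synced explicitly.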

3359 #@+node:ekr.20220330133336.85: *6* iterative.Yield 

3360 def do_Yield(self, node: Node) -> ActionList: 

3361 

3362 result: ActionList = [ 

3363 (self.name, 'yield'), 

3364 ] 

3365 if hasattr(node, 'value'): 

3366 result.extend([ 

3367 (self.visit, node.value), 

3368 ]) 

3369 return result 

3370 #@+node:ekr.20220330133336.86: *6* iterative.YieldFrom 

3371 # YieldFrom(expr value) 

3372 

3373 def do_YieldFrom(self, node: Node) -> ActionList: 

3374 

3375 return ([ 

3376 (self.name, 'yield'), 

3377 (self.name, 'from'), 

3378 (self.visit, node.value), 

3379 ]) 

3380 #@-others 

3381#@+node:ekr.20200107165250.1: *3* class Orange 

3382class Orange: 

3383 """ 

3384 A flexible and powerful beautifier for Python. 

3385 Orange is the new black. 

3386 

3387 *Important*: This is predominantly a *token*-based beautifier. 

3388 However, orange.colon and orange.possible_unary_op use the parse 

3389 tree to provide context that would otherwise be difficult to 

3390 deduce. 

3391 """ 

3392 # This switch is really a comment. It will always be false. 

3393 # It marks the code that simulates the operation of the black tool. 

3394 black_mode = False 

3395 

3396 # Patterns... 

3397 nobeautify_pat = re.compile(r'\s*#\s*pragma:\s*no\s*beautify\b|#\s*@@nobeautify') 

3398 

3399 # Patterns from FastAtRead class, specialized for python delims. 

3400 node_pat = re.compile(r'^(\s*)#@\+node:([^:]+): \*(\d+)?(\*?) (.*)$') # @node 

3401 start_doc_pat = re.compile(r'^\s*#@\+(at|doc)?(\s.*?)?$') # @doc or @ 

3402 at_others_pat = re.compile(r'^(\s*)#@(\+|-)others\b(.*)$') # @others 

3403 

3404 # Doc parts end with @c or a node sentinel. Specialized for python. 

3405 end_doc_pat = re.compile(r"^\s*#@(@(c(ode)?)|([+]node\b.*))$") 

3406 #@+others 

3407 #@+node:ekr.20200107165250.2: *4* orange.ctor 

3408 def __init__(self, settings: Optional[Dict[str, Any]]=None): 

3409 """Ctor for Orange class.""" 

3410 if settings is None: 

3411 settings = {} 

3412 valid_keys = ( 

3413 'allow_joined_strings', 

3414 'max_join_line_length', 

3415 'max_split_line_length', 

3416 'orange', 

3417 'tab_width', 

3418 ) 

3419 # For mypy... 

3420 self.kind: str = '' 

3421 # Default settings... 

3422 self.allow_joined_strings = False # EKR's preference. 

3423 self.max_join_line_length = 88 

3424 self.max_split_line_length = 88 

3425 self.tab_width = 4 

3426 # Override from settings dict... 

3427 for key in settings: # pragma: no cover 

3428 value = settings.get(key) 

3429 if key in valid_keys and value is not None: 

3430 setattr(self, key, value) 

3431 else: 

3432 g.trace(f"Unexpected setting: {key} = {value!r}") 

3433 #@+node:ekr.20200107165250.51: *4* orange.push_state 

3434 def push_state(self, kind: str, value: Any=None) -> None: 

3435 """Append a state to the state stack.""" 

3436 state = ParseState(kind, value) 

3437 self.state_stack.append(state) 

3438 #@+node:ekr.20200107165250.8: *4* orange: Entries 

3439 #@+node:ekr.20200107173542.1: *5* orange.beautify (main token loop) 

3440 def oops(self) -> None: # pragma: no cover 

3441 g.trace(f"Unknown kind: {self.kind}") 

3442 

3443 def beautify(self, contents: str, filename: str, tokens: List["Token"], tree: Node, 

3444 

3445 max_join_line_length: Optional[int]=None, max_split_line_length: Optional[int]=None, 

3446 ) -> str: 

3447 """ 

3448 The main line. Create output tokens and return the result as a string. 

3449 """ 

3450 # Config overrides 

3451 if max_join_line_length is not None: 

3452 self.max_join_line_length = max_join_line_length 

3453 if max_split_line_length is not None: 

3454 self.max_split_line_length = max_split_line_length 

3455 # State vars... 

3456 self.curly_brackets_level = 0 # Number of unmatched '{' tokens. 

3457 self.decorator_seen = False # Set by do_name for do_op. 

3458 self.in_arg_list = 0 # > 0 if in an arg list of a def. 

3459 self.level = 0 # Set only by do_indent and do_dedent. 

3460 self.lws = '' # Leading whitespace. 

3461 self.paren_level = 0 # Number of unmatched '(' tokens. 

3462 self.square_brackets_stack: List[bool] = [] # A stack of bools, for self.word(). 

3463 self.state_stack: List["ParseState"] = [] # Stack of ParseState objects. 

3464 self.val = None # The input token's value (a string). 

3465 self.verbatim = False # True: don't beautify. 

3466 # 

3467 # Init output list and state... 

3468 self.code_list: List[Token] = [] # The list of output tokens. 

3469 self.code_list_index = 0 # The token's index. 

3470 self.tokens = tokens # The list of input tokens. 

3471 self.tree = tree 

3472 self.add_token('file-start', '') 

3473 self.push_state('file-start') 

3474 for i, token in enumerate(tokens): 

3475 self.token = token 

3476 self.kind, self.val, self.line = token.kind, token.value, token.line 

3477 if self.verbatim: 

3478 self.do_verbatim() 

3479 else: 

3480 func = getattr(self, f"do_{token.kind}", self.oops) 

3481 func() 

3482 # Any post pass would go here. 

3483 return tokens_to_string(self.code_list) 
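# Editor's note: a minimal usage sketch, using this file's own helpers
# (init_from_string tokenizes, parses and links; beautify then re-renders):
#   contents = 'x = [1 ,2]\n'
#   tokens, tree = TokenOrderGenerator().init_from_string(contents, '<string>')
#   print(Orange().beautify(contents, '<string>', tokens, tree))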

3484 #@+node:ekr.20200107172450.1: *5* orange.beautify_file (entry) 

3485 def beautify_file(self, filename: str) -> bool: # pragma: no cover 

3486 """ 

3487 Orange: Beautify the given external file. 

3488 

3489 Return True if the file was changed. 

3490 """ 

3491 self.filename = filename 

3492 tog = TokenOrderGenerator() 

3493 contents, encoding, tokens, tree = tog.init_from_file(filename) 

3494 if not contents or not tokens or not tree: 

3495 return False # #2529: Not an error. 

3496 # Beautify. 

3497 try: 

3498 results = self.beautify(contents, filename, tokens, tree) 

3499 except BeautifyError: 

3500 return False # #2578. 

3501 # Something besides newlines must change. 

3502 if regularize_nls(contents) == regularize_nls(results): 

3503 return False 

3504 if 0: # This obscures more important error messages. 

3505 show_diffs(contents, results, filename=filename) 

3506 # Write the results 

3507 print(f"Beautified: {g.shortFileName(filename)}") 

3508 write_file(filename, results, encoding=encoding) 

3509 return True 

3510 #@+node:ekr.20200107172512.1: *5* orange.beautify_file_diff (entry) 

3511 def beautify_file_diff(self, filename: str) -> bool: # pragma: no cover 

3512 """ 

3513 Orange: Print the diffs that would result from the orange-file command. 

3514 

3515 Return True if the file would be changed. 

3516 """ 

3517 tag = 'diff-beautify-file' 

3518 self.filename = filename 

3519 tog = TokenOrderGenerator() 

3520 contents, encoding, tokens, tree = tog.init_from_file(filename) 

3521 if not contents or not tokens or not tree: 

3522 print(f"{tag}: Can not beautify: {filename}") 

3523 return False 

3524 # Beautify. 

3525 results = self.beautify(contents, filename, tokens, tree) 

3526 # Something besides newlines must change. 

3527 if regularize_nls(contents) == regularize_nls(results): 

3528 print(f"{tag}: Unchanged: {filename}") 

3529 return False 

3530 # Show the diffs. 

3531 show_diffs(contents, results, filename=filename) 

3532 return True 

3533 #@+node:ekr.20200107165250.13: *4* orange: Input token handlers 

3534 #@+node:ekr.20200107165250.14: *5* orange.do_comment 

3535 in_doc_part = False 

3536 

3537 def do_comment(self) -> None: 

3538 """Handle a comment token.""" 

3539 val = self.val 

3540 # 

3541 # Leo-specific code... 

3542 if self.node_pat.match(val): 

3543 # Clear per-node state. 

3544 self.in_doc_part = False 

3545 self.verbatim = False 

3546 self.decorator_seen = False 

3547 # Do *not* clear other state, which may persist across @others. 

3548 # self.curly_brackets_level = 0 

3549 # self.in_arg_list = 0 

3550 # self.level = 0 

3551 # self.lws = '' 

3552 # self.paren_level = 0 

3553 # self.square_brackets_stack = [] 

3554 # self.state_stack = [] 

3555 else: 

3556 # Keep track of verbatim mode. 

3557 if self.beautify_pat.match(val): 

3558 self.verbatim = False 

3559 elif self.nobeautify_pat.match(val): 

3560 self.verbatim = True 

3561 # Keep track of @doc parts, to honor the convention for splitting lines. 

3562 if self.start_doc_pat.match(val): 

3563 self.in_doc_part = True 

3564 if self.end_doc_pat.match(val): 

3565 self.in_doc_part = False 

3566 # 

3567 # General code: Generate the comment. 

3568 self.clean('blank') 

3569 entire_line = self.line.lstrip().startswith('#') 

3570 if entire_line: 

3571 self.clean('hard-blank') 

3572 self.clean('line-indent') 

3573 # #1496: No further munging needed. 

3574 val = self.line.rstrip() 

3575 else: 

3576 # Exactly two spaces before trailing comments. 

3577 val = ' ' + self.val.rstrip() 

3578 self.add_token('comment', val) 

3579 #@+node:ekr.20200107165250.15: *5* orange.do_encoding 

3580 def do_encoding(self) -> None: 

3581 """ 

3582 Handle the encoding token. 

3583 """ 

3584 pass 

3585 #@+node:ekr.20200107165250.16: *5* orange.do_endmarker 

3586 def do_endmarker(self) -> None: 

3587 """Handle an endmarker token.""" 

3588 # Ensure exactly one blank at the end of the file. 

3589 self.clean_blank_lines() 

3590 self.add_token('line-end', '\n') 

3591 #@+node:ekr.20200107165250.18: *5* orange.do_indent & do_dedent & helper 

3592 # Note: other methods use self.level. 

3593 

3594 def do_dedent(self) -> None: 

3595 """Handle dedent token.""" 

3596 self.level -= 1 

3597 self.lws = self.level * self.tab_width * ' ' 

3598 self.line_indent() 

3599 if self.black_mode: # pragma: no cover (black) 

3600 state = self.state_stack[-1] 

3601 if state.kind == 'indent' and state.value == self.level: 

3602 self.state_stack.pop() 

3603 state = self.state_stack[-1] 

3604 if state.kind in ('class', 'def'): 

3605 self.state_stack.pop() 

3606 self.handle_dedent_after_class_or_def(state.kind) 

3607 

3608 def do_indent(self) -> None: 

3609 """Handle indent token.""" 

3610 # #2578: Refuse to beautify files containing leading tabs or unusual indentation. 

3611 consider_message = 'consider using python/Tools/scripts/reindent.py' 

3612 if '\t' in self.val: 

3613 message = f"Leading tabs found: {self.filename}" 

3614 print(message) 

3615 print(consider_message) 

3616 raise BeautifyError(message) 

3617 if (len(self.val) % self.tab_width) != 0: 

3618 message = f" Indentation error: {self.filename}" 

3619 print(message) 

3620 print(consider_message) 

3621 raise BeautifyError(message) 

3622 new_indent = self.val 

3623 old_indent = self.level * self.tab_width * ' ' 

3624 if new_indent > old_indent: 

3625 self.level += 1 

3626 elif new_indent < old_indent: # pragma: no cover (defensive) 

3627 g.trace('\n===== can not happen', repr(new_indent), repr(old_indent)) 

3628 self.lws = new_indent 

3629 self.line_indent() 

3630 #@+node:ekr.20200220054928.1: *6* orange.handle_dedent_after_class_or_def 

3631 def handle_dedent_after_class_or_def(self, kind: str) -> None: # pragma: no cover (black) 

3632 """ 

3633 Insert blank lines after a class or def as the result of a 'dedent' token. 

3634 

3635 Normal comment lines may precede the 'dedent'. 

3636 Insert the blank lines *before* such comment lines. 

3637 """ 

3638 # 

3639 # Compute the tail. 

3640 i = len(self.code_list) - 1 

3641 tail: List[Token] = [] 

3642 while i > 0: 

3643 t = self.code_list.pop() 

3644 i -= 1 

3645 if t.kind == 'line-indent': 

3646 pass 

3647 elif t.kind == 'line-end': 

3648 tail.insert(0, t) 

3649 elif t.kind == 'comment': 

3650 # Only underindented single-line comments belong in the tail. 

3651 # @+node comments must never be in the tail. 

3652 single_line = self.code_list[i].kind in ('line-end', 'line-indent') 

3653 lws = len(t.value) - len(t.value.lstrip()) 

3654 underindent = lws <= len(self.lws) 

3655 if underindent and single_line and not self.node_pat.match(t.value): 

3656 # A single-line comment. 

3657 tail.insert(0, t) 

3658 else: 

3659 self.code_list.append(t) 

3660 break 

3661 else: 

3662 self.code_list.append(t) 

3663 break 

3664 # 

3665 # Remove leading 'line-end' tokens from the tail. 

3666 while tail and tail[0].kind == 'line-end': 

3667 tail = tail[1:] 

3668 # 

3669 # Put the newlines *before* the tail. 

3670 # For Leo, always use 1 blank line. 

3671 n = 1 # n = 2 if kind == 'class' else 1 

3672 # Retain the token (intention) for debugging. 

3673 self.add_token('blank-lines', n) 

3674 for i in range(0, n + 1): 

3675 self.add_token('line-end', '\n') 

3676 if tail: 

3677 self.code_list.extend(tail) 

3678 self.line_indent() 

3679 #@+node:ekr.20200107165250.20: *5* orange.do_name 

3680 def do_name(self) -> None: 

3681 """Handle a name token.""" 

3682 name = self.val 

3683 if self.black_mode and name in ('class', 'def'): # pragma: no cover (black) 

3684 # Handle newlines before and after 'class' or 'def' 

3685 self.decorator_seen = False 

3686 state = self.state_stack[-1] 

3687 if state.kind == 'decorator': 

3688 # Always do this, regardless of @bool clean-blank-lines. 

3689 self.clean_blank_lines() 

3690 # Suppress split/join. 

3691 self.add_token('hard-newline', '\n') 

3692 self.add_token('line-indent', self.lws) 

3693 self.state_stack.pop() 

3694 else: 

3695 # Always do this, regardless of @bool clean-blank-lines. 

3696 self.blank_lines(2 if name == 'class' else 1) 

3697 self.push_state(name) 

3698 # For trailing lines after inner classes/defs. 

3699 self.push_state('indent', self.level) 

3700 self.word(name) 

3701 return 

3702 # 

3703 # Leo mode... 

3704 if name in ('class', 'def'): 

3705 self.word(name) 

3706 elif name in ( 

3707 'and', 'elif', 'else', 'for', 'if', 'in', 'not', 'not in', 'or', 'while' 

3708 ): 

3709 self.word_op(name) 

3710 else: 

3711 self.word(name) 

3712 #@+node:ekr.20200107165250.21: *5* orange.do_newline & do_nl 

3713 def do_newline(self) -> None: 

3714 """Handle a regular newline.""" 

3715 self.line_end() 

3716 

3717 def do_nl(self) -> None: 

3718 """Handle a continuation line.""" 

3719 self.line_end() 

3720 #@+node:ekr.20200107165250.22: *5* orange.do_number 

3721 def do_number(self) -> None: 

3722 """Handle a number token.""" 

3723 self.blank() 

3724 self.add_token('number', self.val) 

3725 #@+node:ekr.20200107165250.23: *5* orange.do_op 

3726 def do_op(self) -> None: 

3727 """Handle an op token.""" 

3728 val = self.val 

3729 if val == '.': 

3730 self.clean('blank') 

3731 prev = self.code_list[-1] 

3732 # #2495 & #2533: Special case for 'from .' 

3733 if prev.kind == 'word' and prev.value == 'from': 

3734 self.blank() 

3735 self.add_token('op-no-blanks', val) 

3736 elif val == '@': 

3737 if self.black_mode: # pragma: no cover (black) 

3738 if not self.decorator_seen: 

3739 self.blank_lines(1) 

3740 self.decorator_seen = True 

3741 self.clean('blank') 

3742 self.add_token('op-no-blanks', val) 

3743 self.push_state('decorator') 

3744 elif val == ':': 

3745 # Treat slices differently. 

3746 self.colon(val) 

3747 elif val in ',;': 

3748 # Pep 8: Avoid extraneous whitespace immediately before 

3749 # comma, semicolon, or colon. 

3750 self.clean('blank') 

3751 self.add_token('op', val) 

3752 self.blank() 

3753 elif val in '([{': 

3754 # Pep 8: Avoid extraneous whitespace immediately inside 

3755 # parentheses, brackets or braces. 

3756 self.lt(val) 

3757 elif val in ')]}': 

3758 # Ditto. 

3759 self.rt(val) 

3760 elif val == '=': 

3761 # Pep 8: Don't use spaces around the = sign when used to indicate 

3762 # a keyword argument or a default parameter value. 

3763 if self.paren_level: 

3764 self.clean('blank') 

3765 self.add_token('op-no-blanks', val) 

3766 else: 

3767 self.blank() 

3768 self.add_token('op', val) 

3769 self.blank() 

3770 elif val in '~+-': 

3771 self.possible_unary_op(val) 

3772 elif val == '*': 

3773 self.star_op() 

3774 elif val == '**': 

3775 self.star_star_op() 

3776 else: 

3777 # Pep 8: always surround binary operators with a single space. 

3778 # '==','+=','-=','*=','**=','/=','//=','%=','!=','<=','>=','<','>', 

3779 # '^','~','*','**','&','|','/','//', 

3780 # Pep 8: If operators with different priorities are used, 

3781 # consider adding whitespace around the operators with the lowest priority(ies). 

3782 self.blank() 

3783 self.add_token('op', val) 

3784 self.blank() 
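# Editor's note: the '=' rule above in practice (a sketch):
#   def f(a = 1): ...   ->   def f(a=1): ...   # inside parens: op-no-blanks
#   x=1                 ->   x = 1             # at statement level: blanks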

3785 #@+node:ekr.20200107165250.24: *5* orange.do_string 

3786 def do_string(self) -> None: 

3787 """Handle a 'string' token.""" 

3788 # Careful: continued strings may contain '\r' 

3789 val = regularize_nls(self.val) 

3790 self.add_token('string', val) 

3791 self.blank() 

3792 #@+node:ekr.20200210175117.1: *5* orange.do_verbatim 

3793 beautify_pat = re.compile( 

3794 r'#\s*pragma:\s*beautify\b|#\s*@@beautify|#\s*@\+node|#\s*@[+-]others|#\s*@[+-]<<') 

3795 

3796 def do_verbatim(self) -> None: 

3797 """ 

3798 Handle one token in verbatim mode. 

3799 End verbatim mode when the appropriate comment is seen. 

3800 """ 

3801 kind = self.kind 

3802 # 

3803 # Careful: tokens may contain '\r' 

3804 val = regularize_nls(self.val) 

3805 if kind == 'comment': 

3806 if self.beautify_pat.match(val): 

3807 self.verbatim = False 

3808 val = val.rstrip() 

3809 self.add_token('comment', val) 

3810 return 

3811 if kind == 'indent': 

3812 self.level += 1 

3813 self.lws = self.level * self.tab_width * ' ' 

3814 if kind == 'dedent': 

3815 self.level -= 1 

3816 self.lws = self.level * self.tab_width * ' ' 

3817 self.add_token('verbatim', val) 

3818 #@+node:ekr.20200107165250.25: *5* orange.do_ws 

3819 def do_ws(self) -> None: 

3820 """ 

3821 Handle the "ws" pseudo-token. 

3822 

3823 Put the whitespace only if it ends with backslash-newline. 

3824 """ 

3825 val = self.val 

3826 # Handle backslash-newline. 

3827 if '\\\n' in val: 

3828 self.clean('blank') 

3829 self.add_token('op-no-blanks', val) 

3830 return 

3831 # Handle start-of-line whitespace. 

3832 prev = self.code_list[-1] 

3833 inner = self.paren_level or self.square_brackets_stack or self.curly_brackets_level 

3834 if prev.kind == 'line-indent' and inner: 

3835 # Retain the indent that won't be cleaned away. 

3836 self.clean('line-indent') 

3837 self.add_token('hard-blank', val) 

3838 #@+node:ekr.20200107165250.26: *4* orange: Output token generators 

3839 #@+node:ekr.20200118145044.1: *5* orange.add_line_end 

3840 def add_line_end(self) -> "Token": 

3841 """Add a line-end request to the code list.""" 

3842 # This may be called from do_name as well as do_newline and do_nl. 

3843 assert self.token.kind in ('newline', 'nl'), self.token.kind 

3844 self.clean('blank') # Important! 

3845 self.clean('line-indent') 

3846 t = self.add_token('line-end', '\n') 

3847 # Distinguish between kinds of 'line-end' tokens. 

3848 t.newline_kind = self.token.kind 

3849 return t 

3850 #@+node:ekr.20200107170523.1: *5* orange.add_token 

3851 def add_token(self, kind: str, value: Any) -> "Token": 

3852 """Add an output token to the code list.""" 

3853 tok = Token(kind, value) 

3854 tok.index = self.code_list_index # For debugging only. 

3855 self.code_list_index += 1 

3856 self.code_list.append(tok) 

3857 return tok 

3858 #@+node:ekr.20200107165250.27: *5* orange.blank 

3859 def blank(self) -> None: 

3860 """Add a blank request to the code list.""" 

3861 prev = self.code_list[-1] 

3862 if prev.kind not in ( 

3863 'blank', 

3864 'blank-lines', 

3865 'file-start', 

3866 'hard-blank', # Unique to orange. 

3867 'line-end', 

3868 'line-indent', 

3869 'lt', 

3870 'op-no-blanks', 

3871 'unary-op', 

3872 ): 

3873 self.add_token('blank', ' ') 

3874 #@+node:ekr.20200107165250.29: *5* orange.blank_lines (black only) 

3875 def blank_lines(self, n: int) -> None: # pragma: no cover (black) 

3876 """ 

3877 Add a request for n blank lines to the code list. 

3878 Multiple blank-lines requests yield at least the maximum of all requests. 

3879 """ 

3880 self.clean_blank_lines() 

3881 prev = self.code_list[-1] 

3882 if prev.kind == 'file-start': 

3883 self.add_token('blank-lines', n) 

3884 return 

3885 for i in range(0, n + 1): 

3886 self.add_token('line-end', '\n') 

3887 # Retain the token (intention) for debugging. 

3888 self.add_token('blank-lines', n) 

3889 self.line_indent() 

3890 #@+node:ekr.20200107165250.30: *5* orange.clean 

3891 def clean(self, kind: str) -> None: 

3892 """Remove the last item of token list if it has the given kind.""" 

3893 prev = self.code_list[-1] 

3894 if prev.kind == kind: 

3895 self.code_list.pop() 

3896 #@+node:ekr.20200107165250.31: *5* orange.clean_blank_lines 

3897 def clean_blank_lines(self) -> bool: 

3898 """ 

3899 Remove all vestiges of previous blank lines. 

3900 

3901 Return True if any of the cleaned 'line-end' tokens represented "hard" newlines. 

3902 """ 

3903 cleaned_newline = False 

3904 table = ('blank-lines', 'line-end', 'line-indent') 

3905 while self.code_list[-1].kind in table: 

3906 t = self.code_list.pop() 

3907 if t.kind == 'line-end' and getattr(t, 'newline_kind', None) != 'nl': 

3908 cleaned_newline = True 

3909 return cleaned_newline 

3910 #@+node:ekr.20200107165250.32: *5* orange.colon 

3911 def colon(self, val: str) -> None: 

3912 """Handle a colon.""" 

3913 

3914 def is_expr(node: Node) -> bool: 

3915 """True if node is any expression other than += number.""" 

3916 if isinstance(node, (ast.BinOp, ast.Call, ast.IfExp)): 

3917 return True 

3918 return ( 

3919 isinstance(node, ast.UnaryOp) 

3920 and not isinstance(node.operand, ast.Num) 

3921 ) 

3922 

3923 node = self.token.node 

3924 self.clean('blank') 

3925 if not isinstance(node, ast.Slice): 

3926 self.add_token('op', val) 

3927 self.blank() 

3928 return 

3929 # A slice. 

3930 lower = getattr(node, 'lower', None) 

3931 upper = getattr(node, 'upper', None) 

3932 step = getattr(node, 'step', None) 

3933 if any(is_expr(z) for z in (lower, upper, step)): 

3934 prev = self.code_list[-1] 

3935 if prev.value not in '[:': 

3936 self.blank() 

3937 self.add_token('op', val) 

3938 self.blank() 

3939 else: 

3940 self.add_token('op-no-blanks', val) 
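# Editor's note: a sketch of the slice rule above:
#   a[1:2]           -> ':' gets op-no-blanks: a[1:2]
#   a[x + 1 : f(y)]  -> a bound is an expression (BinOp/Call), so ':' gets blanks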

3941 #@+node:ekr.20200107165250.33: *5* orange.line_end 

3942 def line_end(self) -> None: 

3943 """Add a line-end request to the code list.""" 

3944 # This should be called only by do_newline and do_nl. 

3945 node, token = self.token.statement_node, self.token 

3946 assert token.kind in ('newline', 'nl'), (token.kind, g.callers()) 

3947 # Create the 'line-end' output token. 

3948 self.add_line_end() 

3949 # Attempt to split the line. 

3950 was_split = self.split_line(node, token) 

3951 # Attempt to join the line only if it has not just been split. 

3952 if not was_split and self.max_join_line_length > 0: 

3953 self.join_lines(node, token) 

3954 # Add the indentation for all lines 

3955 # until the next indent or unindent token. 

3956 self.line_indent() 

3957 #@+node:ekr.20200107165250.40: *5* orange.line_indent 

3958 def line_indent(self) -> None: 

3959 """Add a line-indent token.""" 

3960 self.clean('line-indent') # Defensive. Should never happen. 

3961 self.add_token('line-indent', self.lws) 

3962 #@+node:ekr.20200107165250.41: *5* orange.lt & rt 

3963 #@+node:ekr.20200107165250.42: *6* orange.lt 

3964 def lt(self, val: str) -> None: 

3965 """Generate code for a left paren or curly/square bracket.""" 

3966 assert val in '([{', repr(val) 

3967 if val == '(': 

3968 self.paren_level += 1 

3969 elif val == '[': 

3970 self.square_brackets_stack.append(False) 

3971 else: 

3972 self.curly_brackets_level += 1 

3973 self.clean('blank') 

3974 prev = self.code_list[-1] 

3975 if prev.kind in ('op', 'word-op'): 

3976 self.blank() 

3977 self.add_token('lt', val) 

3978 elif prev.kind == 'word': 

3979 # Only suppress blanks before '(' or '[' for non-keywords. 

3980 if val == '{' or prev.value in ('if', 'else', 'return', 'for'): 

3981 self.blank() 

3982 elif val == '(': 

3983 self.in_arg_list += 1 

3984 self.add_token('lt', val) 

3985 else: 

3986 self.clean('blank') 

3987 self.add_token('op-no-blanks', val) 

3988 #@+node:ekr.20200107165250.43: *6* orange.rt 

3989 def rt(self, val: str) -> None: 

3990 """Generate code for a right paren or curly/square bracket.""" 

3991 assert val in ')]}', repr(val) 

3992 if val == ')': 

3993 self.paren_level -= 1 

3994 self.in_arg_list = max(0, self.in_arg_list - 1) 

3995 elif val == ']': 

3996 self.square_brackets_stack.pop() 

3997 else: 

3998 self.curly_brackets_level -= 1 

3999 self.clean('blank') 

4000 self.add_token('rt', val) 

4001 #@+node:ekr.20200107165250.45: *5* orange.possible_unary_op & unary_op 

4002 def possible_unary_op(self, s: str) -> None: 

4003 """Add a unary or binary op to the token list.""" 

4004 node = self.token.node 

4005 self.clean('blank') 

4006 if isinstance(node, ast.UnaryOp): 

4007 self.unary_op(s) 

4008 else: 

4009 self.blank() 

4010 self.add_token('op', s) 

4011 self.blank() 

4012 

4013 def unary_op(self, s: str) -> None: 

4014 """Add an operator request to the code list.""" 

4015 assert s and isinstance(s, str), repr(s) 

4016 self.clean('blank') 

4017 prev = self.code_list[-1] 

4018 if prev.kind == 'lt': 

4019 self.add_token('unary-op', s) 

4020 else: 

4021 self.blank() 

4022 self.add_token('unary-op', s) 

4023 #@+node:ekr.20200107165250.46: *5* orange.star_op 

4024 def star_op(self) -> None: 

4025 """Put a '*' op, with special cases for *args.""" 

4026 val = '*' 

4027 node = self.token.node 

4028 self.clean('blank') 

4029 if isinstance(node, ast.arguments): 

4030 self.blank() 

4031 self.add_token('op', val) 

4032 return # #2533 

4033 if self.paren_level > 0: 

4034 prev = self.code_list[-1] 

4035 if prev.kind == 'lt' or (prev.kind, prev.value) == ('op', ','): 

4036 self.blank() 

4037 self.add_token('op', val) 

4038 return 

4039 self.blank() 

4040 self.add_token('op', val) 

4041 self.blank() 

4042 #@+node:ekr.20200107165250.47: *5* orange.star_star_op 

4043 def star_star_op(self) -> None: 

4044 """Put a ** operator, with a special case for **kwargs.""" 

4045 val = '**' 

4046 node = self.token.node 

4047 self.clean('blank') 

4048 if isinstance(node, ast.arguments): 

4049 self.blank() 

4050 self.add_token('op', val) 

4051 return # #2533 

4052 if self.paren_level > 0: 

4053 prev = self.code_list[-1] 

4054 if prev.kind == 'lt' or (prev.kind, prev.value) == ('op', ','): 

4055 self.blank() 

4056 self.add_token('op', val) 

4057 return 

4058 self.blank() 

4059 self.add_token('op', val) 

4060 self.blank() 

4061 #@+node:ekr.20200107165250.48: *5* orange.word & word_op 

4062 def word(self, s: str) -> None: 

4063 """Add a word request to the code list.""" 

4064 assert s and isinstance(s, str), repr(s) 

4065 node = self.token.node 

4066 if isinstance(node, ast.ImportFrom) and s == 'import': # #2533 

4067 self.clean('blank') 

4068 self.add_token('blank', ' ') 

4069 self.add_token('word', s) 

4070 elif self.square_brackets_stack: 

4071 # A previous 'op-no-blanks' token may cancel this blank. 

4072 self.blank() 

4073 self.add_token('word', s) 

4074 elif self.in_arg_list > 0: 

4075 self.add_token('word', s) 

4076 self.blank() 

4077 else: 

4078 self.blank() 

4079 self.add_token('word', s) 

4080 self.blank() 

4081 

4082 def word_op(self, s: str) -> None: 

4083 """Add a word-op request to the code list.""" 

4084 assert s and isinstance(s, str), repr(s) 

4085 self.blank() 

4086 self.add_token('word-op', s) 

4087 self.blank() 

4088 #@+node:ekr.20200118120049.1: *4* orange: Split/join 

4089 #@+node:ekr.20200107165250.34: *5* orange.split_line & helpers 

4090 def split_line(self, node: Node, token: "Token") -> bool: 

4091 """ 

4092 Split token's line, if possible and enabled. 

4093 

4094 Return True if the line was broken into two or more lines. 

4095 """ 

4096 assert token.kind in ('newline', 'nl'), repr(token) 

4097 # Return if splitting is disabled: 

4098 if self.max_split_line_length <= 0: # pragma: no cover (user option) 

4099 return False 

4100 # Return if the node can't be split. 

4101 if not is_long_statement(node): 

4102 return False 

4103 # Find the *output* tokens of the previous lines. 

4104 line_tokens = self.find_prev_line() 

4105 line_s = ''.join([z.to_string() for z in line_tokens]) 

4106 # Do nothing for short lines. 

4107 if len(line_s) < self.max_split_line_length: 

4108 return False 

4109 # Return if the previous line has no opening delim: (, [ or {. 

4110 if not any(z.kind == 'lt' for z in line_tokens): # pragma: no cover (defensive) 

4111 return False 

4112 prefix = self.find_line_prefix(line_tokens) 

4113 # Calculate the tail before cleaning the prefix. 

4114 tail = line_tokens[len(prefix) :] 

4115 # Cut back the token list: subtract 1 for the trailing line-end. 

4116 self.code_list = self.code_list[: len(self.code_list) - len(line_tokens) - 1] 

4117 # Append the tail, splitting it further, as needed. 

4118 self.append_tail(prefix, tail) 

4119 # Add the line-end token deleted by find_line_prefix. 

4120 self.add_token('line-end', '\n') 

4121 return True 
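# Editor's note: a sketch of the intended rewrite (with the default
# max_split_line_length of 88). A too-long line such as
#   result = some_call(first_argument, second_argument, third_argument)
# becomes, after append_tail splits the tail at top-level commas:
#   result = some_call(
#       first_argument,
#       second_argument,
#       third_argument,
#   )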

4122 #@+node:ekr.20200107165250.35: *6* orange.append_tail 

4123 def append_tail(self, prefix: List["Token"], tail: List["Token"]) -> None: 

4124 """Append the tail tokens, splitting the line further as necessary.""" 

4125 tail_s = ''.join([z.to_string() for z in tail]) 

4126 if len(tail_s) < self.max_split_line_length: 

4127 # Add the prefix. 

4128 self.code_list.extend(prefix) 

4129 # Start a new line and increase the indentation. 

4130 self.add_token('line-end', '\n') 

4131 self.add_token('line-indent', self.lws + ' ' * 4) 

4132 self.code_list.extend(tail) 

4133 return 

4134 # Still too long. Split the line at commas. 

4135 self.code_list.extend(prefix) 

4136 # Start a new line and increase the indentation. 

4137 self.add_token('line-end', '\n') 

4138 self.add_token('line-indent', self.lws + ' ' * 4) 

4139 open_delim = Token(kind='lt', value=prefix[-1].value) 

4140 value = open_delim.value.replace('(', ')').replace('[', ']').replace('{', '}') 

4141 close_delim = Token(kind='rt', value=value) 

4142 delim_count = 1 

4143 lws = self.lws + ' ' * 4 

4144 for i, t in enumerate(tail): 

4145 if t.kind == 'op' and t.value == ',': 

4146 if delim_count == 1: 

4147 # Start a new line. 

4148 self.add_token('op-no-blanks', ',') 

4149 self.add_token('line-end', '\n') 

4150 self.add_token('line-indent', lws) 

4151 # Kill a following blank. 

4152 if i + 1 < len(tail): 

4153 next_t = tail[i + 1] 

4154 if next_t.kind == 'blank': 

4155 next_t.kind = 'no-op' 

4156 next_t.value = '' 

4157 else: 

4158 self.code_list.append(t) 

4159 elif t.kind == close_delim.kind and t.value == close_delim.value: 

4160 # Done if the delims match. 

4161 delim_count -= 1 

4162 if delim_count == 0: 

4163 # Start a new line 

4164 self.add_token('op-no-blanks', ',') 

4165 self.add_token('line-end', '\n') 

4166 self.add_token('line-indent', self.lws) 

4167 self.code_list.extend(tail[i:]) 

4168 return 

4169 lws = lws[:-4] 

4170 self.code_list.append(t) 

4171 elif t.kind == open_delim.kind and t.value == open_delim.value: 

4172 delim_count += 1 

4173 lws = lws + ' ' * 4 

4174 self.code_list.append(t) 

4175 else: 

4176 self.code_list.append(t) 

4177 g.trace('BAD DELIMS', delim_count) # pragma: no cover 

4178 #@+node:ekr.20200107165250.36: *6* orange.find_prev_line 

4179 def find_prev_line(self) -> List["Token"]: 

4180 """Return the previous line, as a list of tokens.""" 

4181 line = [] 

4182 for t in reversed(self.code_list[:-1]): 

4183 if t.kind in ('hard-newline', 'line-end'): 

4184 break 

4185 line.append(t) 

4186 return list(reversed(line)) 

4187 #@+node:ekr.20200107165250.37: *6* orange.find_line_prefix 

4188 def find_line_prefix(self, token_list: List["Token"]) -> List["Token"]: 

4189 """ 

4190 Return all tokens up to and including the first lt token. 

4191 (The remaining tokens form the tail; see split_line and append_tail.) 

4192 """ 

4193 result = [] 

4194 for i, t in enumerate(token_list): 

4195 result.append(t) 

4196 if t.kind == 'lt': 

4197 break 

4198 return result 

4199 #@+node:ekr.20200107165250.39: *5* orange.join_lines 

4200 def join_lines(self, node: Node, token: "Token") -> None: 

4201 """ 

4202 Join preceding lines, if possible and enabled. 

4203 token is a line_end token. node is the corresponding ast node. 

4204 """ 

4205 if self.max_join_line_length <= 0: # pragma: no cover (user option) 

4206 return 

4207 assert token.kind in ('newline', 'nl'), repr(token) 

4208 if token.kind == 'nl': 

4209 return 

4210 # Scan backward in the *code* list, 

4211 # looking for 'line-end' tokens with tok.newline_kind == 'nl' 

4212 nls = 0 

4213 i = len(self.code_list) - 1 

4214 t = self.code_list[i] 

4215 assert t.kind == 'line-end', repr(t) 

4216 # Not all tokens have a newline_kind ivar. 

4217 assert t.newline_kind == 'newline' 

4218 i -= 1 

4219 while i >= 0: 

4220 t = self.code_list[i] 

4221 if t.kind == 'comment': 

4222 # Can't join. 

4223 return 

4224 if t.kind == 'string' and not self.allow_joined_strings: 

4225 # An EKR preference: don't join strings, no matter what black does. 

4226 # This allows "short" f-strings to be aligned. 

4227 return 

4228 if t.kind == 'line-end': 

4229 if getattr(t, 'newline_kind', None) == 'nl': 

4230 nls += 1 

4231 else: 

4232 break # pragma: no cover 

4233 i -= 1 

4234 # Retain the file-start token. 

4235 if i <= 0: 

4236 i = 1 

4237 if nls <= 0: # pragma: no cover (rare) 

4238 return 

4239 # Retain the line-end and any following line-indent. 

4240 # Required, so that the regex below won't eat too much. 

4241 while True: 

4242 t = self.code_list[i] 

4243 if t.kind == 'line-end': 

4244 if getattr(t, 'newline_kind', None) == 'nl': # pragma: no cover (rare) 

4245 nls -= 1 

4246 i += 1 

4247 elif self.code_list[i].kind == 'line-indent': 

4248 i += 1 

4249 else: 

4250 break # pragma: no cover (defensive) 

4251 if nls <= 0: # pragma: no cover (defensive) 

4252 return 

4253 # Calculate the joined line. 

4254 tail = self.code_list[i:] 

4255 tail_s = tokens_to_string(tail) 

4256 tail_s = re.sub(r'\n\s*', ' ', tail_s) 

4257 tail_s = tail_s.replace('( ', '(').replace(' )', ')') 

4258 tail_s = tail_s.rstrip() 

4259 # Don't join the lines if they would be too long. 

4260 if len(tail_s) > self.max_join_line_length: # pragma: no cover (defensive) 

4261 return 

4262 # Cut back the code list. 

4263 self.code_list = self.code_list[:i] 

4264 # Add the new output tokens. 

4265 self.add_token('string', tail_s) 

4266 self.add_token('line-end', '\n') 
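# Editor's note: a sketch of the join (when the result fits within
# max_join_line_length):
#   result = f(
#       a, b)
# becomes 'result = f(a, b)': the re.sub collapses newline+indent runs,
# and the replaces trim blanks just inside the parens.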

4267 #@-others 

4268#@+node:ekr.20200107170126.1: *3* class ParseState 

4269class ParseState: 

4270 """ 

4271 A class representing items in the parse state stack. 

4272 

4273 The present states: 

4274 

4275 'file-start': Ensures the state stack is never empty. 

4276 

4277 'decorator': The last '@' was a decorator. 

4278 

4279 do_op(): push_state('decorator') 

4280 do_name(): pops the stack if state.kind == 'decorator'. 

4281 

4282 'indent': The indentation level for 'class' and 'def' names. 

4283 

4284 do_name(): push_state('indent', self.level) 

4285 do_dedent(): pops the stack once or twice if state.value == self.level. 

4286 

4287 """ 

4288 

4289 def __init__(self, kind: str, value: Any) -> None: # value may be None or an int (see push_state). 

4290 self.kind = kind 

4291 self.value = value 

4292 

4293 def __repr__(self) -> str: 

4294 return f"State: {self.kind} {self.value!r}" # pragma: no cover 

4295 

4296 __str__ = __repr__ 

4297#@+node:ekr.20191231084514.1: *3* class ReassignTokens 

4298class ReassignTokens: 

4299 """A class that reassigns tokens to more appropriate ast nodes.""" 

4300 #@+others 

4301 #@+node:ekr.20191231084640.1: *4* reassign.reassign 

4302 def reassign(self, filename: str, tokens: List["Token"], tree: Node) -> None: 

4303 """The main entry point.""" 

4304 self.filename = filename 

4305 self.tokens = tokens 

4306 # For now, just handle Call nodes. 

4307 for node in ast.walk(tree): 

4308 if isinstance(node, ast.Call): 

4309 self.visit_call(node) 

4310 #@+node:ekr.20191231084853.1: *4* reassign.visit_call 

4311 def visit_call(self, node: Node) -> None: 

4312 """ReassignTokens.visit_call""" 

4313 tokens = tokens_for_node(self.filename, node, self.tokens) 

4314 node0, node9 = tokens[0].node, tokens[-1].node 

4315 nca = nearest_common_ancestor(node0, node9) 

4316 if not nca: 

4317 return 

4318 # Associate () with the call node. 

4319 i = tokens[-1].index 

4320 j = find_paren_token(i + 1, self.tokens) 

4321 if j is None: 

4322 return # pragma: no cover 

4323 k = find_paren_token(j + 1, self.tokens) 

4324 if k is None: 

4325 return # pragma: no cover 

4326 self.tokens[j].node = nca 

4327 self.tokens[k].node = nca 

4328 add_token_to_token_list(self.tokens[j], nca) 

4329 add_token_to_token_list(self.tokens[k], nca) 

4330 #@-others 

4331#@+node:ekr.20191110080535.1: *3* class Token 

4332class Token: 

4333 """ 

4334 A class representing a 5-tuple, plus additional data. 

4335 """ 

4336 

4337 def __init__(self, kind: str, value: str): 

4338 

4339 self.kind = kind 

4340 self.value = value 

4341 # 

4342 # Injected by Tokenizer.add_token. 

4343 self.five_tuple = None 

4344 self.index = 0 

4345 # The entire line containing the token. 

4346 # Same as five_tuple.line. 

4347 self.line = '' 

4348 # The line number, for errors and dumps. 

4349 # Same as five_tuple.start[0] 

4350 self.line_number = 0 

4351 # 

4352 # Injected by Tokenizer.add_token. 

4353 self.level = 0 

4354 self.node: Optional[Node] = None 

4355 

4356 def __repr__(self) -> str: # pragma: no cover 

4357 nl_kind = getattr(self, 'newline_kind', '') 

4358 s = f"{self.kind:}.{self.index:<3}" 

4359 return f"{s:>18}:{nl_kind:7} {self.show_val(80)}" 

4360 

4361 def __str__(self) -> str: # pragma: no cover 

4362 nl_kind = getattr(self, 'newline_kind', '') 

4363 return f"{self.kind}.{self.index:<3}{nl_kind:8} {self.show_val(80)}" 

4364 

4365 def to_string(self) -> str: 

4366 """Return the contribution of the token to the source file.""" 

4367 return self.value if isinstance(self.value, str) else '' 

4368 #@+others 

4369 #@+node:ekr.20191231114927.1: *4* token.brief_dump 

4370 def brief_dump(self) -> str: # pragma: no cover 

4371 """Dump a token.""" 

4372 return ( 

4373 f"{self.index:>3} line: {self.line_number:<2} " 

4374 f"{self.kind:>11} {self.show_val(100)}") 

4375 #@+node:ekr.20200223022950.11: *4* token.dump 

4376 def dump(self) -> str: # pragma: no cover 

4377 """Dump a token and related links.""" 

4378 # Let block. 

4379 node_id = self.node.node_index if self.node else '' 

4380 node_cn = self.node.__class__.__name__ if self.node else '' 

4381 return ( 

4382 f"{self.line_number:4} " 

4383 f"{node_id:5} {node_cn:16} " 

4384 f"{self.index:>5} {self.kind:>11} " 

4385 f"{self.show_val(100)}") 

4386 #@+node:ekr.20200121081151.1: *4* token.dump_header 

4387 def dump_header(self) -> None: # pragma: no cover 

4388 """Print the header for token.dump""" 

4389 print( 

4390 f"\n" 

4391 f" node {'':10} token token\n" 

4392 f"line index class {'':10} index kind value\n" 

4393 f"==== ===== ===== {'':10} ===== ==== =====\n") 

4394 #@+node:ekr.20191116154328.1: *4* token.error_dump 

4395 def error_dump(self) -> str: # pragma: no cover 

4396 """Dump a token or result node for error message.""" 

4397 if self.node: 

4398 node_id = obj_id(self.node) 

4399 node_s = f"{node_id} {self.node.__class__.__name__}" 

4400 else: 

4401 node_s = "None" 

4402 return ( 

4403 f"index: {self.index:<3} {self.kind:>12} {self.show_val(20):<20} " 

4404 f"{node_s}") 

4405 #@+node:ekr.20191113095507.1: *4* token.show_val 

4406 def show_val(self, truncate_n: int) -> str: # pragma: no cover 

4407 """Return the token.value field.""" 

4408 if self.kind in ('ws', 'indent'): 

4409 val = str(len(self.value)) 

4410 elif self.kind == 'string': 

4411 # Important: don't add a repr for 'string' tokens. 

4412 # repr just adds another layer of confusion. 

4413 val = g.truncate(self.value, truncate_n) 

4414 else: 

4415 val = g.truncate(repr(self.value), truncate_n) 

4416 return val 

4417 #@-others 

4418#@+node:ekr.20191110165235.1: *3* class Tokenizer 

4419class Tokenizer: 

4420 

4421 """Create a list of Tokens from contents.""" 

4422 

4423 results: List[Token] = [] 

4424 

4425 #@+others 

4426 #@+node:ekr.20191110165235.2: *4* tokenizer.add_token 

4427 token_index = 0 

4428 prev_line_token = None 

4429 

4430 def add_token(self, kind: str, five_tuple: Any, line: str, s_row: int, value: str) -> None: 

4431 """ 

4432 Add a token to the results list. 

4433 

4434 Subclasses could override this method to filter out specific tokens. 

4435 """ 

4436 tok = Token(kind, value) 

4437 tok.five_tuple = five_tuple 

4438 tok.index = self.token_index 

4439 # Bump the token index. 

4440 self.token_index += 1 

4441 tok.line = line 

4442 tok.line_number = s_row 

4443 self.results.append(tok) 

4444 #@+node:ekr.20191110170551.1: *4* tokenizer.check_results 

4445 def check_results(self, contents: str) -> None: 

4446 

4447 # Split the results into lines. 

4448 result = ''.join([z.to_string() for z in self.results]) 

4449 result_lines = g.splitLines(result) 

4450 # Check. 

4451 ok = result == contents and result_lines == self.lines 

4452 assert ok, ( 

4453 f"\n" 

4454 f" result: {result!r}\n" 

4455 f" contents: {contents!r}\n" 

4456 f"result_lines: {result_lines}\n" 

4457 f" lines: {self.lines}" 

4458 ) 

4459 #@+node:ekr.20191110165235.3: *4* tokenizer.create_input_tokens 

4460 def create_input_tokens(self, contents: str, tokens: Generator) -> List["Token"]: 

4461 """ 

4462 Generate a list of Token's from tokens, a list of 5-tuples. 

4463 """ 

4464 # Create the physical lines. 

4465 self.lines = contents.splitlines(True) 

4466 # Create the list of character offsets of the start of each physical line. 

4467 last_offset, self.offsets = 0, [0] 

4468 for line in self.lines: 

4469 last_offset += len(line) 

4470 self.offsets.append(last_offset) 

4471 # Handle each token, appending tokens and between-token whitespace to results. 

4472 self.prev_offset, self.results = -1, [] 

4473 for token in tokens: 

4474 self.do_token(contents, token) 

4475 # Check that the results round-trip the contents exactly. 

4476 self.check_results(contents) 

4477 # Return results, as a list. 

4478 return self.results 

4479 #@+node:ekr.20191110165235.4: *4* tokenizer.do_token (the gem) 

4480 header_has_been_shown = False 

4481 

4482 def do_token(self, contents: str, five_tuple: Any) -> None: 

4483 """ 

4484 Handle the given token, optionally including between-token whitespace. 

4485 

4486 This is part of the "gem". 

4487 

4488 Links: 

4489 

4490 - 11/13/19: ENB: A much better untokenizer 

4491 https://groups.google.com/forum/#!msg/leo-editor/DpZ2cMS03WE/VPqtB9lTEAAJ 

4492 

4493 - Untokenize does not round-trip ws before bs-nl 

4494 https://bugs.python.org/issue38663 

4495 """ 

4496 import token as token_module 

4497 # Unpack.. 

4498 tok_type, val, start, end, line = five_tuple 

4499 s_row, s_col = start # row/col offsets of start of token. 

4500 e_row, e_col = end # row/col offsets of end of token. 

4501 kind = token_module.tok_name[tok_type].lower() 

4502 # Calculate the token's start/end offsets: character offsets into contents. 

4503 s_offset = self.offsets[max(0, s_row - 1)] + s_col 

4504 e_offset = self.offsets[max(0, e_row - 1)] + e_col 

4505 # tok_s is the corresponding string in contents. 

4506 tok_s = contents[s_offset:e_offset] 

4507 # Add any preceding between-token whitespace. 

4508 ws = contents[self.prev_offset:s_offset] 

4509 if ws: 

4510 # No need for a hook. 

4511 self.add_token('ws', five_tuple, line, s_row, ws) 

4512 # Always add token, even if it contributes no text! 

4513 self.add_token(kind, five_tuple, line, s_row, tok_s) 

4514 # Update the ending offset. 

4515 self.prev_offset = e_offset 
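# Editor's note: a worked example of the offset arithmetic above (a sketch):
# for contents = 'a\nbc\n', create_input_tokens builds
#   lines   = ['a\n', 'bc\n']
#   offsets = [0, 2, 5]
# so a token starting at (row=2, col=1) has s_offset = offsets[1] + 1 == 3,
# which is the 'c' in contents.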

4516 #@-others 

4517#@+node:ekr.20191113063144.1: *3* class TokenOrderGenerator 

4518class TokenOrderGenerator: 

4519 """ 

4520 A class that traverses ast (parse) trees in token order. 

4521 

4522 Overview: https://github.com/leo-editor/leo-editor/issues/1440#issue-522090981 

4523 

4524 Theory of operation: 

4525 - https://github.com/leo-editor/leo-editor/issues/1440#issuecomment-573661883 

4526 - http://leoeditor.com/appendices.html#tokenorder-classes-theory-of-operation 

4527 

4528 How to: http://leoeditor.com/appendices.html#tokenorder-class-how-to 

4529 

4530 Project history: https://github.com/leo-editor/leo-editor/issues/1440#issuecomment-574145510 

4531 """ 

4532 

4533 begin_end_stack: List[str] = [] 

4534 n_nodes = 0 # The number of nodes that have been visited. 

4535 node_index = 0 # The index into the node_stack. 

4536 node_stack: List[ast.AST] = [] # The stack of parent nodes. 

4537 

4538 #@+others 

4539 #@+node:ekr.20200103174914.1: *4* tog: Init... 

4540 #@+node:ekr.20191228184647.1: *5* tog.balance_tokens 

4541 def balance_tokens(self, tokens: List["Token"]) -> int: 

4542 """ 

4543 TOG.balance_tokens. 

4544 

4545 Insert two-way links between matching paren tokens. 

4546 """ 

4547 count, stack = 0, [] 

4548 for token in tokens: 

4549 if token.kind == 'op': 

4550 if token.value == '(': 

4551 count += 1 

4552 stack.append(token.index) 

4553 if token.value == ')': 

4554 if stack: 

4555 index = stack.pop() 

4556 tokens[index].matching_paren = token.index 

4557 tokens[token.index].matching_paren = index 

4558 else: # pragma: no cover 

4559 g.trace(f"unmatched ')' at index {token.index}") 

4560 if stack: # pragma: no cover 

4561 g.trace(f"unmatched '(' at {','.join(stack)}") 

4562 return count 
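# Editor's note: a sketch of the links created above: for 'f(g())',
# the outer '(' token's matching_paren is the index of the outer ')',
# and vice versa; the inner pair is linked the same way.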

4563 #@+node:ekr.20191113063144.4: *5* tog.create_links 

4564 def create_links(self, tokens: List["Token"], tree: Node, file_name: str='') -> List: 

4565 """ 

4566 Create two-way links between the given tokens and the ast tree. 

4567 

4568 For compatibility with legacy code, callers may wrap the call: list(tog.create_links(...)). 

4569 

4570 The sync_token method creates the links and verifies that the resulting 

4571 tree traversal generates exactly the given tokens in exact order. 

4572 

4573 tokens: the list of Token instances for the input. 

4574 Created by make_tokens(). 

4575 tree: the ast tree for the input. 

4576 Created by parse_ast(). 

4577 """ 

4578 # Init all ivars. 

4579 self.file_name = file_name # For tests. 

4580 self.level = 0 # Python indentation level. 

4581 self.node = None # The node being visited. 

4582 self.tokens = tokens # The immutable list of input tokens. 

4583 self.tree = tree # The tree of ast.AST nodes. 

4584 # Traverse the tree. 

4585 self.visit(tree) 

4586 # Ensure that all tokens are patched. 

4587 self.node = tree 

4588 self.token('endmarker', '') 

4589 # Return [] for compatibility with legacy code: list(tog.create_links). 

4590 return [] 

4591 #@+node:ekr.20191229071733.1: *5* tog.init_from_file 

4592 def init_from_file(self, filename: str) -> Tuple[str, str, List["Token"], Node]: # pragma: no cover 

4593 """ 

4594 Create the tokens and ast tree for the given file. 

4595 Create links between tokens and the parse tree. 

4596 Return (contents, encoding, tokens, tree). 

4597 """ 

4598 self.level = 0 

4599 self.filename = filename 

4600 encoding, contents = read_file_with_encoding(filename) 

4601 if not contents: 

4602 return None, None, None, None 

4603 self.tokens = tokens = make_tokens(contents) 

4604 self.tree = tree = parse_ast(contents) 

4605 self.create_links(tokens, tree) 

4606 return contents, encoding, tokens, tree 

4607 #@+node:ekr.20191229071746.1: *5* tog.init_from_string 

4608 def init_from_string(self, contents: str, filename: str) -> Tuple[List["Token"], Node]: # pragma: no cover 

4609 """ 

4610 Tokenize, parse and create links in the contents string. 

4611 

4612 Return (tokens, tree). 

4613 """ 

4614 self.filename = filename 

4615 self.level = 0 

4616 self.tokens = tokens = make_tokens(contents) 

4617 self.tree = tree = parse_ast(contents) 

4618 self.create_links(tokens, tree) 

4619 return tokens, tree 
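# Editor's note: a minimal sketch of the links these entries create:
#   tog = TokenOrderGenerator()
#   tokens, tree = tog.init_from_string('a = b.c\n', '<string>')
#   # Each significant token now has token.node set, and set_links has
#   # appended the token to its node's token_list.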

4620 #@+node:ekr.20220402052020.1: *4* tog: Synchronizers... 

4621 # The synchronizers sync tokens to nodes. 

4622 #@+node:ekr.20200110162044.1: *5* tog.find_next_significant_token 

4623 def find_next_significant_token(self) -> Optional["Token"]: 

4624 """ 

4625 Scan from *after* self.tokens[px] looking for the next significant 

4626 token. 

4627 

4628 Return the token, or None. Never change self.px. 

4629 """ 

4630 px = self.px + 1 

4631 while px < len(self.tokens): 

4632 token = self.tokens[px] 

4633 px += 1 

4634 if is_significant_token(token): 

4635 return token 

4636 # This will never happen, because the endmarker token is significant. 

4637 return None # pragma: no cover 

4638 #@+node:ekr.20191125120814.1: *5* tog.set_links 

4639 last_statement_node = None 

4640 

4641 def set_links(self, node: Node, token: "Token") -> None: 

4642 """Make two-way links between token and the given node.""" 

4643 # Don't bother assigning comment, comma, paren, ws and endmarker tokens. 

4644 if token.kind == 'comment': 

4645 # Append the comment to node.comment_list. 

4646 comment_list: List["Token"] = getattr(node, 'comment_list', []) 

4647 node.comment_list = comment_list + [token] 

4648 return 

4649 if token.kind in ('endmarker', 'ws'): 

4650 return 

4651 if token.kind == 'op' and token.value in ',()': 

4652 return 

4653 # *Always* remember the last statement. 

4654 statement = find_statement_node(node) 

4655 if statement: 

4656 self.last_statement_node = statement 

4657 assert not isinstance(self.last_statement_node, ast.Module) 

4658 if token.node is not None: # pragma: no cover 

4659 line_s = f"line {token.line_number}:" 

4660 raise AssignLinksError( 

4661 f" file: {self.filename}\n" 

4662 f"{line_s:>12} {token.line.strip()}\n" 

4663 f"token index: {self.px}\n" 

4664 f"token.node is not None\n" 

4665 f" token.node: {token.node.__class__.__name__}\n" 

4666 f" callers: {g.callers()}") 

4667 # Assign newlines to the previous statement node, if any. 

4668 if token.kind in ('newline', 'nl'): 

4669 # Set an *auxiliary* link for the split/join logic. 

4670 # Do *not* set token.node! 

4671 token.statement_node = self.last_statement_node 

4672 return 

4673 if is_significant_token(token): 

4674 # Link the token to the ast node. 

4675 token.node = node 

4676 # Add the token to node's token_list. 

4677 add_token_to_token_list(token, node) 

4678 #@+node:ekr.20191124083124.1: *5* tog.sync_name (aka name) 

4679 def sync_name(self, val: str) -> None: 

4680 aList = val.split('.') 

4681 if len(aList) == 1: 

4682 self.sync_token('name', val) 

4683 else: 

4684 for i, part in enumerate(aList): 

4685 self.sync_token('name', part) 

4686 if i < len(aList) - 1: 

4687 self.sync_op('.') 

4688 

4689 name = sync_name # for readability. 

4690 #@+node:ekr.20220402052102.1: *5* tog.sync_op (aka op) 

4691 def sync_op(self, val: str) -> None: 

4692 """ 

4693 Sync to the given operator. 

4694 

4695 val may be '(' or ')' *only* if the parens *will* actually exist in the 

4696 token list. 

4697 """ 

4698 self.sync_token('op', val) 

4699 

4700 op = sync_op # For readability. 

4701 #@+node:ekr.20191113063144.7: *5* tog.sync_token (aka token) 

4702 px = -1 # Index of the previously synced token. 

4703 

4704 def sync_token(self, kind: str, val: str) -> None: 

4705 """ 

4706 Sync to a token whose kind & value are given. The token need not be 

4707 significant, but it must be guaranteed to exist in the token list. 

4708 

4709 The checks in this method constitute a strong, ever-present unit test.

4710 

4711 Scan the tokens *after* px, looking for a token T matching (kind, val). 

4712 Raise AssignLinksError if a significant token is found that doesn't match T.

4713 Otherwise: 

4714 - Create two-way links between all assignable tokens between px and T. 

4715 - Create two-way links between T and self.node. 

4716 - Advance by updating self.px to point to T. 

4717 """ 

4718 node, tokens = self.node, self.tokens 

4719 assert isinstance(node, ast.AST), repr(node) 

4720 # g.trace( 

4721 # f"px: {self.px:2} " 

4722 # f"node: {node.__class__.__name__:<10} " 

4723 # f"kind: {kind:>10}: val: {val!r}") 

4724 # 

4725 # Step one: Look for token T. 

4726 old_px = px = self.px + 1 

4727 while px < len(self.tokens): 

4728 token = tokens[px] 

4729 if (kind, val) == (token.kind, token.value): 

4730 break # Success. 

4731 if kind == token.kind == 'number': 

4732 val = token.value 

4733 break # Benign: use the token's value, a string, instead of a number. 

4734 if is_significant_token(token): # pragma: no cover 

4735 line_s = f"line {token.line_number}:" 

4736 val = str(val) # for g.truncate. 

4737 raise AssignLinksError( 

4738 f" file: {self.filename}\n" 

4739 f"{line_s:>12} {token.line.strip()}\n" 

4740 f"Looking for: {kind}.{g.truncate(val, 40)!r}\n" 

4741 f" found: {token.kind}.{token.value!r}\n" 

4742 f"token.index: {token.index}\n") 

4743 # Skip the insignificant token. 

4744 px += 1 

4745 else: # pragma: no cover 

4746 val = str(val) # for g.truncate. 

4747 raise AssignLinksError( 

4748 f" file: {self.filename}\n" 

4749 f"Looking for: {kind}.{g.truncate(val, 40)}\n" 

4750 f" found: end of token list") 

4751 # 

4752 # Step two: Assign *secondary* links only for newline tokens. 

4753 # Ignore all other non-significant tokens. 

4754 while old_px < px: 

4755 token = tokens[old_px] 

4756 old_px += 1 

4757 if token.kind in ('comment', 'newline', 'nl'): 

4758 self.set_links(node, token) 

4759 # 

4760 # Step three: Set links in the found token. 

4761 token = tokens[px] 

4762 self.set_links(node, token) 

4763 # 

4764 # Step four: Advance. 

4765 self.px = px 

4766 

4767 token = sync_token # For readability. 
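
A standalone sketch of steps one and four, using tokenize's kinds instead of this file's Token class; the real method also assigns the links of steps two and three. The sync helper below is hypothetical:

    import io
    import tokenize

    SIGNIFICANT = {tokenize.NAME, tokenize.NUMBER, tokenize.STRING, tokenize.OP}

    def sync(tokens, px, kind, val):
        # Return the index of the next (kind, val) token after px.
        for i in range(px + 1, len(tokens)):
            token = tokens[i]
            if (token.type, token.string) == (kind, val):
                return i  # Success: the caller would set links, then advance px.
            if token.type in SIGNIFICANT:
                raise ValueError(f'expected {val!r}, found {token.string!r}')
        raise ValueError(f'expected {val!r}, found end of token list')

    tokens = list(tokenize.generate_tokens(io.StringIO('x = 1\n').readline))
    px = sync(tokens, -1, tokenize.NAME, 'x')  # 0
    px = sync(tokens, px, tokenize.OP, '=')    # 1
    print(px)  # 1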

4768 #@+node:ekr.20191223052749.1: *4* tog: Traversal... 

4769 #@+node:ekr.20191113063144.3: *5* tog.enter_node 

4770 def enter_node(self, node: Node) -> None: 

4771 """Enter a node.""" 

4772 # Update the stats. 

4773 self.n_nodes += 1 

4774 # Do this first, *before* updating self.node. 

4775 node.parent = self.node 

4776 if self.node: 

4777 children: List[Node] = getattr(self.node, 'children', []) 

4778 children.append(node) 

4779 self.node.children = children 

4780 # Inject the node_index field. 

4781 assert not hasattr(node, 'node_index'), g.callers() 

4782 node.node_index = self.node_index 

4783 self.node_index += 1 

4784 # begin_visitor and end_visitor must be paired. 

4785 self.begin_end_stack.append(node.__class__.__name__) 

4786 # Push the previous node. 

4787 self.node_stack.append(self.node) 

4788 # Update self.node *last*. 

4789 self.node = node 

4790 #@+node:ekr.20200104032811.1: *5* tog.leave_node 

4791 def leave_node(self, node: Node) -> None: 

4792 """Leave a node."""

4793 # begin_visitor and end_visitor must be paired. 

4794 entry_name = self.begin_end_stack.pop() 

4795 assert entry_name == node.__class__.__name__, f"{entry_name!r} {node.__class__.__name__}" 

4796 assert self.node == node, (repr(self.node), repr(node)) 

4797 # Restore self.node. 

4798 self.node = self.node_stack.pop() 

4799 #@+node:ekr.20191113081443.1: *5* tog.visit 

4800 def visit(self, node: Node) -> None: 

4801 """Visit the given ast node and all its children, in token order."""

4802 # This saves a lot of tests. 

4803 if node is None: 

4804 return 

4805 if 0: # pragma: no cover 

4806 # Keep this trace! 

4807 cn = node.__class__.__name__ if node else ' ' 

4808 caller1, caller2 = g.callers(2).split(',') 

4809 g.trace(f"{caller1:>15} {caller2:<14} {cn}") 

4810 # More general, more convenient. 

4811 if isinstance(node, (list, tuple)): 

4812 for z in node or []: 

4813 if isinstance(z, ast.AST): 

4814 self.visit(z) 

4815 else: # pragma: no cover 

4816 # Some fields may contain ints or strings. 

4817 assert isinstance(z, (int, str)), z.__class__.__name__ 

4818 return 

4819 # We *do* want to crash if the visitor doesn't exist. 

4820 method = getattr(self, 'do_' + node.__class__.__name__) 

4821 # Don't even *think* about removing the parent/child links. 

4822 # The nearest_common_ancestor function depends upon them. 

4823 self.enter_node(node) 

4824 method(node) 

4825 self.leave_node(node) 
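
Outside this class, the parent/child links that enter_node injects can be sketched with ast.iter_child_nodes. Note one difference: the TOG's children lists come from visitor order, so they contain only visited nodes, while iter_child_nodes also yields operator and context nodes. add_links is a hypothetical helper:

    import ast

    def add_links(node, parent=None):
        # Inject the same 'parent' and 'children' attributes used above.
        node.parent = parent
        node.children = list(ast.iter_child_nodes(node))
        for child in node.children:
            add_links(child, node)

    tree = ast.parse('a = b + 1')
    add_links(tree)
    binop = tree.body[0].value
    print(binop.parent.__class__.__name__)  # Assign
    print([z.__class__.__name__ for z in binop.children])  # ['Name', 'Add', 'Constant']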

4826 #@+node:ekr.20191113063144.13: *4* tog: Visitors... 

4827 #@+node:ekr.20191113063144.32: *5* tog.keyword: not called! 

4828 # keyword arguments supplied to call (NULL identifier for **kwargs) 

4829 

4830 # keyword = (identifier? arg, expr value) 

4831 

4832 def do_keyword(self, node: Node) -> None: # pragma: no cover 

4833 """A keyword arg in an ast.Call.""" 

4834 # This should never be called. 

4835 # tog.handle_call_arguments calls self.visit(kwarg_arg.value) instead.

4836 filename = getattr(self, 'filename', '<no file>') 

4837 raise AssignLinksError( 

4838 f"file: {filename}\n" 

4839 f"do_keyword should never be called\n" 

4840 f"{g.callers(8)}") 

4841 #@+node:ekr.20191113063144.14: *5* tog: Contexts 

4842 #@+node:ekr.20191113063144.28: *6* tog.arg 

4843 # arg = (identifier arg, expr? annotation) 

4844 

4845 def do_arg(self, node: Node) -> None: 

4846 """This is one argument of a list of ast.FunctionDef or ast.Lambda arguments."""

4847 self.name(node.arg) 

4848 annotation = getattr(node, 'annotation', None) 

4849 if annotation is not None: 

4850 self.op(':') 

4851 self.visit(node.annotation) 

4852 #@+node:ekr.20191113063144.27: *6* tog.arguments 

4853 # arguments = ( 

4854 # arg* posonlyargs, arg* args, arg? vararg, arg* kwonlyargs, 

4855 # expr* kw_defaults, arg? kwarg, expr* defaults 

4856 # ) 

4857 

4858 def do_arguments(self, node: Node) -> None: 

4859 """Arguments to ast.FunctionDef or ast.Lambda, **not** ast.Call."""

4860 # 

4861 # No need to generate commas anywhere below. 

4862 # 

4863 # Let block. Some fields may not exist pre Python 3.8. 

4864 n_plain = len(node.args) - len(node.defaults) 

4865 posonlyargs = getattr(node, 'posonlyargs', []) 

4866 vararg = getattr(node, 'vararg', None) 

4867 kwonlyargs = getattr(node, 'kwonlyargs', []) 

4868 kw_defaults = getattr(node, 'kw_defaults', []) 

4869 kwarg = getattr(node, 'kwarg', None) 

4870 # 1. Sync the position-only args. 

4871 if posonlyargs: 

4872 for n, z in enumerate(posonlyargs): 

4873 # g.trace('pos-only', ast.dump(z)) 

4874 self.visit(z) 

4875 self.op('/') 

4876 # 2. Sync all args. 

4877 for i, z in enumerate(node.args): 

4878 self.visit(z) 

4879 if i >= n_plain: 

4880 self.op('=') 

4881 self.visit(node.defaults[i - n_plain]) 

4882 # 3. Sync the vararg. 

4883 if vararg: 

4884 self.op('*') 

4885 self.visit(vararg) 

4886 # 4. Sync the keyword-only args. 

4887 if kwonlyargs: 

4888 if not vararg: 

4889 self.op('*') 

4890 for n, z in enumerate(kwonlyargs): 

4891 self.visit(z) 

4892 val = kw_defaults[n] 

4893 if val is not None: 

4894 self.op('=') 

4895 self.visit(val) 

4896 # 5. Sync the kwarg. 

4897 if kwarg: 

4898 self.op('**') 

4899 self.visit(kwarg) 
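
The defaults alignment deserves a quick check: n_plain counts the leading arguments without defaults, so node.defaults lines up with the trailing entries of node.args.

    import ast

    args = ast.parse('def f(a, b, c=1, d=2): pass').body[0].args
    n_plain = len(args.args) - len(args.defaults)
    print(n_plain)  # 2: 'a' and 'b' have no default.
    for i, z in enumerate(args.args):
        if i >= n_plain:
            print(z.arg, ast.dump(args.defaults[i - n_plain]))  # Pairs 'c' with 1, 'd' with 2.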

4900 #@+node:ekr.20191113063144.15: *6* tog.AsyncFunctionDef 

4901 # AsyncFunctionDef(identifier name, arguments args, stmt* body, expr* decorator_list, 

4902 # expr? returns) 

4903 

4904 def do_AsyncFunctionDef(self, node: Node) -> None: 

4905 

4906 if node.decorator_list: 

4907 for z in node.decorator_list: 

4908 # '@%s\n' 

4909 self.op('@') 

4910 self.visit(z) 

4911 # 'async def (%s): -> %s\n'

4912 # 'async def %s(%s):\n'

4913 async_token_type = 'async' if has_async_tokens else 'name' 

4914 self.token(async_token_type, 'async') 

4915 self.name('def') 

4916 self.name(node.name) # A string 

4917 self.op('(') 

4918 self.visit(node.args) 

4919 self.op(')') 

4920 returns = getattr(node, 'returns', None) 

4921 if returns is not None: 

4922 self.op('->') 

4923 self.visit(node.returns) 

4924 self.op(':') 

4925 self.level += 1 

4926 self.visit(node.body) 

4927 self.level -= 1 

4928 #@+node:ekr.20191113063144.16: *6* tog.ClassDef 

4929 def do_ClassDef(self, node: Node) -> None: 

4930 

4931 for z in node.decorator_list or []: 

4932 # @{z}\n 

4933 self.op('@') 

4934 self.visit(z) 

4935 # class name(bases):\n 

4936 self.name('class') 

4937 self.name(node.name) # A string. 

4938 if node.bases: 

4939 self.op('(') 

4940 self.visit(node.bases) 

4941 self.op(')') 

4942 self.op(':') 

4943 # Body... 

4944 self.level += 1 

4945 self.visit(node.body) 

4946 self.level -= 1 

4947 #@+node:ekr.20191113063144.17: *6* tog.FunctionDef 

4948 # FunctionDef( 

4949 # identifier name, arguments args, 

4950 # stmt* body, 

4951 # expr* decorator_list, 

4952 # expr? returns, 

4953 # string? type_comment) 

4954 

4955 def do_FunctionDef(self, node: Node) -> None: 

4956 

4957 # Guards... 

4958 returns = getattr(node, 'returns', None) 

4959 # Decorators... 

4960 # @{z}\n 

4961 for z in node.decorator_list or []: 

4962 self.op('@') 

4963 self.visit(z) 

4964 # Signature... 

4965 # def name(args): -> returns\n 

4966 # def name(args):\n 

4967 self.name('def') 

4968 self.name(node.name) # A string. 

4969 self.op('(') 

4970 self.visit(node.args) 

4971 self.op(')') 

4972 if returns is not None: 

4973 self.op('->') 

4974 self.visit(node.returns) 

4975 self.op(':') 

4976 # Body... 

4977 self.level += 1 

4978 self.visit(node.body) 

4979 self.level -= 1 

4980 #@+node:ekr.20191113063144.18: *6* tog.Interactive 

4981 def do_Interactive(self, node: Node) -> None: # pragma: no cover 

4982 

4983 self.visit(node.body) 

4984 #@+node:ekr.20191113063144.20: *6* tog.Lambda 

4985 def do_Lambda(self, node: Node) -> None: 

4986 

4987 self.name('lambda') 

4988 self.visit(node.args) 

4989 self.op(':') 

4990 self.visit(node.body) 

4991 #@+node:ekr.20191113063144.19: *6* tog.Module 

4992 def do_Module(self, node: Node) -> None: 

4993 

4994 # Encoding is a non-syncing statement. 

4995 self.visit(node.body) 

4996 #@+node:ekr.20191113063144.21: *5* tog: Expressions 

4997 #@+node:ekr.20191113063144.22: *6* tog.Expr 

4998 def do_Expr(self, node: Node) -> None: 

4999 """An outer expression.""" 

5000 # No need to put parentheses. 

5001 self.visit(node.value) 

5002 #@+node:ekr.20191113063144.23: *6* tog.Expression 

5003 def do_Expression(self, node: Node) -> None: # pragma: no cover 

5004 """An inner expression.""" 

5005 # No need to put parentheses. 

5006 self.visit(node.body) 

5007 #@+node:ekr.20191113063144.24: *6* tog.GeneratorExp 

5008 def do_GeneratorExp(self, node: Node) -> None: 

5009 

5010 # '<gen %s for %s>' % (elt, ','.join(gens)) 

5011 # No need to put parentheses or commas. 

5012 self.visit(node.elt) 

5013 self.visit(node.generators) 

5014 #@+node:ekr.20210321171703.1: *6* tog.NamedExpr 

5015 # NamedExpr(expr target, expr value) 

5016 

5017 def do_NamedExpr(self, node: Node) -> None: # Python 3.8+ 

5018 

5019 self.visit(node.target) 

5020 self.op(':=') 

5021 self.visit(node.value) 

5022 #@+node:ekr.20191113063144.26: *5* tog: Operands 

5023 #@+node:ekr.20191113063144.29: *6* tog.Attribute 

5024 # Attribute(expr value, identifier attr, expr_context ctx) 

5025 

5026 def do_Attribute(self, node: Node) -> None: 

5027 

5028 self.visit(node.value) 

5029 self.op('.') 

5030 self.name(node.attr) # A string. 

5031 #@+node:ekr.20191113063144.30: *6* tog.Bytes 

5032 def do_Bytes(self, node: Node) -> None: 

5033 

5034 """ 

5035 It's invalid to mix bytes and non-bytes literals, so just 

5036 advancing to the next 'string' token suffices. 

5037 """ 

5038 token = self.find_next_significant_token() 

5039 self.token('string', token.value) 

5040 #@+node:ekr.20191113063144.33: *6* tog.comprehension 

5041 # comprehension = (expr target, expr iter, expr* ifs, int is_async) 

5042 

5043 def do_comprehension(self, node: Node) -> None: 

5044 

5045 # No need to put parentheses. 

5046 self.name('for') # #1858. 

5047 self.visit(node.target) # A name 

5048 self.name('in') 

5049 self.visit(node.iter) 

5050 for z in node.ifs or []: 

5051 self.name('if') 

5052 self.visit(z) 

5053 #@+node:ekr.20191113063144.34: *6* tog.Constant 

5054 def do_Constant(self, node: Node) -> None: # pragma: no cover 

5055 """ 

5056 https://greentreesnakes.readthedocs.io/en/latest/nodes.html 

5057 

5058 A constant. The value attribute holds the Python object it represents. 

5059 This can be simple types such as a number, string or None, but also 

5060 immutable container types (tuples and frozensets) if all of their 

5061 elements are constant. 

5062 """ 

5063 # Support Python 3.8. 

5064 if node.value is None or isinstance(node.value, bool): 

5065 # Weird: sync a 'name' token!

5066 self.token('name', repr(node.value)) 

5067 elif node.value == Ellipsis: 

5068 self.op('...') 

5069 elif isinstance(node.value, str): 

5070 self.do_Str(node) 

5071 elif isinstance(node.value, (int, float)): 

5072 self.token('number', repr(node.value)) 

5073 elif isinstance(node.value, bytes): 

5074 self.do_Bytes(node) 

5075 elif isinstance(node.value, tuple): 

5076 self.do_Tuple(node) 

5077 elif isinstance(node.value, frozenset): 

5078 self.do_Set(node) 

5079 else: 

5080 # Unknown type. 

5081 g.trace('----- Oops -----', repr(node.value), g.callers()) 
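
tokenize confirms the 'weird' cases above: True, False and None arrive as plain 'name' tokens, while numeric literals are 'number' tokens.

    import io
    import tokenize

    for source in ('x = True\n', 'x = 3.14\n'):
        tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
        print([(tokenize.tok_name[t.type], t.string) for t in tokens[:3]])
    # [('NAME', 'x'), ('OP', '='), ('NAME', 'True')]
    # [('NAME', 'x'), ('OP', '='), ('NUMBER', '3.14')]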

5082 #@+node:ekr.20191113063144.35: *6* tog.Dict 

5083 # Dict(expr* keys, expr* values) 

5084 

5085 def do_Dict(self, node: Node) -> None: 

5086 

5087 assert len(node.keys) == len(node.values) 

5088 self.op('{') 

5089 # No need to put commas. 

5090 for i, key in enumerate(node.keys): 

5091 key, value = node.keys[i], node.values[i] 

5092 self.visit(key) # a Str node. 

5093 self.op(':') 

5094 if value is not None: 

5095 self.visit(value) 

5096 self.op('}') 

5097 #@+node:ekr.20191113063144.36: *6* tog.DictComp 

5098 # DictComp(expr key, expr value, comprehension* generators) 

5099 

5100 # d2 = {val: key for key, val in d} 

5101 

5102 def do_DictComp(self, node: Node) -> None: 

5103 

5104 self.token('op', '{') 

5105 self.visit(node.key) 

5106 self.op(':') 

5107 self.visit(node.value) 

5108 for z in node.generators or []: 

5109 self.visit(z) 

5110 self.token('op', '}') 

5111 #@+node:ekr.20191113063144.37: *6* tog.Ellipsis 

5112 def do_Ellipsis(self, node: Node) -> None: # pragma: no cover (Does not exist for python 3.8+) 

5113 

5114 self.op('...') 

5115 #@+node:ekr.20191113063144.38: *6* tog.ExtSlice 

5116 # https://docs.python.org/3/reference/expressions.html#slicings 

5117 

5118 # ExtSlice(slice* dims) 

5119 

5120 def do_ExtSlice(self, node: Node) -> None: # pragma: no cover (deprecated) 

5121 

5122 # ','.join(node.dims) 

5123 for i, z in enumerate(node.dims): 

5124 self.visit(z) 

5125 if i < len(node.dims) - 1: 

5126 self.op(',') 

5127 #@+node:ekr.20191113063144.40: *6* tog.Index 

5128 def do_Index(self, node: Node) -> None: # pragma: no cover (deprecated) 

5129 

5130 self.visit(node.value) 

5131 #@+node:ekr.20191113063144.39: *6* tog.FormattedValue: not called! 

5132 # FormattedValue(expr value, int? conversion, expr? format_spec) 

5133 

5134 def do_FormattedValue(self, node: Node) -> None: # pragma: no cover 

5135 """ 

5136 This node represents the *components* of a *single* f-string. 

5137 

5138 Happily, JoinedStr nodes *also* represent *all* f-strings, 

5139 so the TOG should *never* visit this node!

5140 """ 

5141 filename = getattr(self, 'filename', '<no file>') 

5142 raise AssignLinksError( 

5143 f"file: {filename}\n" 

5144 f"do_FormattedValue should never be called") 

5145 

5146 # This code has no chance of being useful... 

5147 

5148 # conv = node.conversion 

5149 # spec = node.format_spec 

5150 # self.visit(node.value) 

5151 # if conv is not None: 

5152 # self.token('number', conv) 

5153 # if spec is not None: 

5154 # self.visit(node.format_spec) 

5155 #@+node:ekr.20191113063144.41: *6* tog.JoinedStr & helpers 

5156 # JoinedStr(expr* values) 

5157 

5158 def do_JoinedStr(self, node: Node) -> None: 

5159 """ 

5160 JoinedStr nodes represent at least one f-string and all other strings 

5161 concatenated to it.

5162 

5163 Analyzing JoinedStr.values would be extremely tricky, for reasons that 

5164 need not be explained here. 

5165 

5166 Instead, we get the tokens *from the token list itself*! 

5167 """ 

5168 for z in self.get_concatenated_string_tokens(): 

5169 self.token(z.kind, z.value) 
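
A quick demonstration, valid for the tokenizers this file targets (f-strings are split into several tokens only in Python 3.12+): one JoinedStr node, one 'string' token.

    import ast
    import io
    import tokenize

    source = "y = f'{x} and {z}'\n"
    print(type(ast.parse(source).body[0].value).__name__)  # JoinedStr
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    print([t.string for t in tokens if t.type == tokenize.STRING])
    # ["f'{x} and {z}'"]: the whole f-string is a single token.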

5170 #@+node:ekr.20191113063144.42: *6* tog.List 

5171 def do_List(self, node: Node) -> None: 

5172 

5173 # No need to put commas. 

5174 self.op('[') 

5175 self.visit(node.elts) 

5176 self.op(']') 

5177 #@+node:ekr.20191113063144.43: *6* tog.ListComp 

5178 # ListComp(expr elt, comprehension* generators) 

5179 

5180 def do_ListComp(self, node: Node) -> None: 

5181 

5182 self.op('[') 

5183 self.visit(node.elt) 

5184 for z in node.generators: 

5185 self.visit(z) 

5186 self.op(']') 

5187 #@+node:ekr.20191113063144.44: *6* tog.Name & NameConstant 

5188 def do_Name(self, node: Node) -> None: 

5189 

5190 self.name(node.id) 

5191 

5192 def do_NameConstant(self, node: Node) -> None: # pragma: no cover (Does not exist in Python 3.8+) 

5193 

5194 self.name(repr(node.value)) 

5195 

5196 #@+node:ekr.20191113063144.45: *6* tog.Num 

5197 def do_Num(self, node: Node) -> None: # pragma: no cover (Does not exist in Python 3.8+) 

5198 

5199 self.token('number', node.n) 

5200 #@+node:ekr.20191113063144.47: *6* tog.Set 

5201 # Set(expr* elts) 

5202 

5203 def do_Set(self, node: Node) -> None: 

5204 

5205 self.op('{') 

5206 self.visit(node.elts) 

5207 self.op('}') 

5208 #@+node:ekr.20191113063144.48: *6* tog.SetComp 

5209 # SetComp(expr elt, comprehension* generators) 

5210 

5211 def do_SetComp(self, node: Node) -> None: 

5212 

5213 self.op('{') 

5214 self.visit(node.elt) 

5215 for z in node.generators or []: 

5216 self.visit(z) 

5217 self.op('}') 

5218 #@+node:ekr.20191113063144.49: *6* tog.Slice 

5219 # slice = Slice(expr? lower, expr? upper, expr? step) 

5220 

5221 def do_Slice(self, node: Node) -> None: 

5222 

5223 lower = getattr(node, 'lower', None) 

5224 upper = getattr(node, 'upper', None) 

5225 step = getattr(node, 'step', None) 

5226 if lower is not None: 

5227 self.visit(lower) 

5228 # Always put the colon between lower and upper.

5229 self.op(':') 

5230 if upper is not None: 

5231 self.visit(upper) 

5232 # Put the second colon if it exists in the token list. 

5233 if step is None: 

5234 token = self.find_next_significant_token() 

5235 if token and token.value == ':': 

5236 self.op(':') 

5237 else: 

5238 self.op(':') 

5239 self.visit(step) 
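
The trailing-colon case is easy to reproduce: in a[1:2:] the Slice node's step is None, yet the token list contains two ':' tokens, so only the tokens can decide whether a second colon must be synced.

    import ast
    import io
    import tokenize

    source = 'a[1:2:]\n'
    print(ast.parse(source).body[0].value.slice.step)  # None
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    print(sum(1 for t in tokens if t.string == ':'))  # 2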

5240 #@+node:ekr.20191113063144.50: *6* tog.Str & helper 

5241 def do_Str(self, node: Node) -> None: 

5242 """This node represents a string constant.""" 

5243 # This loop is necessary to handle string concatenation. 

5244 for z in self.get_concatenated_string_tokens(): 

5245 self.token(z.kind, z.value) 

5246 #@+node:ekr.20200111083914.1: *7* tog.get_concatenated_string_tokens

5247 def get_concatenated_string_tokens(self) -> List["Token"]: 

5248 """ 

5249 Return the next 'string' token and all 'string' tokens concatenated to 

5250 it. *Never* update self.px here. 

5251 """ 

5252 trace = False 

5253 tag = 'tog.get_concatenated_string_tokens' 

5254 i = self.px 

5255 # First, find the next significant token. It should be a string. 

5256 i, token = i + 1, None 

5257 while i < len(self.tokens): 

5258 token = self.tokens[i] 

5259 i += 1 

5260 if token.kind == 'string': 

5261 # Rescan the string. 

5262 i -= 1 

5263 break 

5264 # An error. 

5265 if is_significant_token(token): # pragma: no cover 

5266 break 

5267 # Raise an error if we didn't find the expected 'string' token. 

5268 if not token or token.kind != 'string': # pragma: no cover 

5269 if not token: 

5270 token = self.tokens[-1] 

5271 filename = getattr(self, 'filename', '<no filename>') 

5272 raise AssignLinksError( 

5273 f"\n" 

5274 f"{tag}...\n" 

5275 f"file: {filename}\n" 

5276 f"line: {token.line_number}\n" 

5277 f" i: {i}\n" 

5278 f"expected 'string' token, got {token!s}") 

5279 # Accumulate string tokens. 

5280 assert self.tokens[i].kind == 'string' 

5281 results = [] 

5282 while i < len(self.tokens): 

5283 token = self.tokens[i] 

5284 i += 1 

5285 if token.kind == 'string': 

5286 results.append(token) 

5287 elif token.kind == 'op' or is_significant_token(token): 

5288 # Any significant token *or* any op will halt string concatenation. 

5289 break 

5290 # 'ws', 'nl', 'newline', 'comment', 'indent', 'dedent', etc. 

5291 # The (significant) 'endmarker' token ensures we will have a result.

5292 assert results 

5293 if trace: # pragma: no cover 

5294 g.printObj(results, tag=f"{tag}: Results") 

5295 return results 
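
Implicit concatenation is the reason for the accumulation loop: one ast node can own several 'string' tokens.

    import ast
    import io
    import tokenize

    source = "s = 'a' 'b'  'c'\n"
    print(type(ast.parse(source).body[0].value).__name__)  # Constant: one node.
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    print([t.string for t in tokens if t.type == tokenize.STRING])
    # ["'a'", "'b'", "'c'"]: three tokens for the one node.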

5296 #@+node:ekr.20191113063144.51: *6* tog.Subscript 

5297 # Subscript(expr value, slice slice, expr_context ctx) 

5298 

5299 def do_Subscript(self, node: Node) -> None: 

5300 

5301 self.visit(node.value) 

5302 self.op('[') 

5303 self.visit(node.slice) 

5304 self.op(']') 

5305 #@+node:ekr.20191113063144.52: *6* tog.Tuple 

5306 # Tuple(expr* elts, expr_context ctx) 

5307 

5308 def do_Tuple(self, node: Node) -> None: 

5309 

5310 # Do not call op for parens or commas here. 

5311 # They do not necessarily exist in the token list! 

5312 self.visit(node.elts) 

5313 #@+node:ekr.20191113063144.53: *5* tog: Operators 

5314 #@+node:ekr.20191113063144.55: *6* tog.BinOp 

5315 def do_BinOp(self, node: Node) -> None: 

5316 

5317 op_name_ = op_name(node.op) 

5318 self.visit(node.left) 

5319 self.op(op_name_) 

5320 self.visit(node.right) 

5321 #@+node:ekr.20191113063144.56: *6* tog.BoolOp 

5322 # BoolOp(boolop op, expr* values) 

5323 

5324 def do_BoolOp(self, node: Node) -> None: 

5325 

5326 # op.join(node.values) 

5327 op_name_ = op_name(node.op) 

5328 for i, z in enumerate(node.values): 

5329 self.visit(z) 

5330 if i < len(node.values) - 1: 

5331 self.name(op_name_) 

5332 #@+node:ekr.20191113063144.57: *6* tog.Compare 

5333 # Compare(expr left, cmpop* ops, expr* comparators) 

5334 

5335 def do_Compare(self, node: Node) -> None: 

5336 

5337 assert len(node.ops) == len(node.comparators) 

5338 self.visit(node.left) 

5339 for i, z in enumerate(node.ops): 

5340 op_name_ = op_name(node.ops[i]) 

5341 if op_name_ in ('not in', 'is not'): 

5342 for z in op_name_.split(' '): 

5343 self.name(z) 

5344 elif op_name_.isalpha(): 

5345 self.name(op_name_) 

5346 else: 

5347 self.op(op_name_) 

5348 self.visit(node.comparators[i]) 

5349 #@+node:ekr.20191113063144.58: *6* tog.UnaryOp 

5350 def do_UnaryOp(self, node: Node) -> None: 

5351 

5352 op_name_ = op_name(node.op) 

5353 if op_name_.isalpha(): 

5354 self.name(op_name_) 

5355 else: 

5356 self.op(op_name_) 

5357 self.visit(node.operand) 

5358 #@+node:ekr.20191113063144.59: *6* tog.IfExp (ternary operator) 

5359 # IfExp(expr test, expr body, expr orelse) 

5360 

5361 def do_IfExp(self, node: Node) -> None: 

5362 

5363 #'%s if %s else %s' 

5364 self.visit(node.body) 

5365 self.name('if') 

5366 self.visit(node.test) 

5367 self.name('else') 

5368 self.visit(node.orelse) 

5369 #@+node:ekr.20191113063144.60: *5* tog: Statements 

5370 #@+node:ekr.20191113063144.83: *6* tog.Starred 

5371 # Starred(expr value, expr_context ctx) 

5372 

5373 def do_Starred(self, node: Node) -> None: 

5374 """A starred argument to an ast.Call""" 

5375 self.op('*') 

5376 self.visit(node.value) 

5377 #@+node:ekr.20191113063144.61: *6* tog.AnnAssign 

5378 # AnnAssign(expr target, expr annotation, expr? value, int simple) 

5379 

5380 def do_AnnAssign(self, node: Node) -> None: 

5381 

5382 # {node.target}:{node.annotation}={node.value}\n' 

5383 self.visit(node.target) 

5384 self.op(':') 

5385 self.visit(node.annotation) 

5386 if node.value is not None: # #1851 

5387 self.op('=') 

5388 self.visit(node.value) 

5389 #@+node:ekr.20191113063144.62: *6* tog.Assert 

5390 # Assert(expr test, expr? msg) 

5391 

5392 def do_Assert(self, node: Node) -> None: 

5393 

5394 # Guards... 

5395 msg = getattr(node, 'msg', None) 

5396 # No need to put parentheses or commas. 

5397 self.name('assert') 

5398 self.visit(node.test) 

5399 if msg is not None: 

5400 self.visit(node.msg) 

5401 #@+node:ekr.20191113063144.63: *6* tog.Assign 

5402 def do_Assign(self, node: Node) -> None: 

5403 

5404 for z in node.targets: 

5405 self.visit(z) 

5406 self.op('=') 

5407 self.visit(node.value) 

5408 #@+node:ekr.20191113063144.64: *6* tog.AsyncFor 

5409 def do_AsyncFor(self, node: Node) -> None: 

5410 

5411 # The def line... 

5412 # Py 3.8 changes the kind of token. 

5413 async_token_type = 'async' if has_async_tokens else 'name' 

5414 self.token(async_token_type, 'async') 

5415 self.name('for') 

5416 self.visit(node.target) 

5417 self.name('in') 

5418 self.visit(node.iter) 

5419 self.op(':') 

5420 # Body... 

5421 self.level += 1 

5422 self.visit(node.body) 

5423 # Else clause... 

5424 if node.orelse: 

5425 self.name('else') 

5426 self.op(':') 

5427 self.visit(node.orelse) 

5428 self.level -= 1 

5429 #@+node:ekr.20191113063144.65: *6* tog.AsyncWith 

5430 def do_AsyncWith(self, node: Node) -> None: 

5431 

5432 async_token_type = 'async' if has_async_tokens else 'name' 

5433 self.token(async_token_type, 'async') 

5434 self.do_With(node) 

5435 #@+node:ekr.20191113063144.66: *6* tog.AugAssign 

5436 # AugAssign(expr target, operator op, expr value) 

5437 

5438 def do_AugAssign(self, node: Node) -> None: 

5439 

5440 # %s%s=%s\n' 

5441 op_name_ = op_name(node.op) 

5442 self.visit(node.target) 

5443 self.op(op_name_ + '=') 

5444 self.visit(node.value) 

5445 #@+node:ekr.20191113063144.67: *6* tog.Await 

5446 # Await(expr value) 

5447 

5448 def do_Await(self, node: Node) -> None: 

5449 

5450 #'await %s\n' 

5451 async_token_type = 'await' if has_async_tokens else 'name' 

5452 self.token(async_token_type, 'await') 

5453 self.visit(node.value) 

5454 #@+node:ekr.20191113063144.68: *6* tog.Break 

5455 def do_Break(self, node: Node) -> None: 

5456 

5457 self.name('break') 

5458 #@+node:ekr.20191113063144.31: *6* tog.Call & helpers 

5459 # Call(expr func, expr* args, keyword* keywords) 

5460 

5461 # Python 3 ast.Call nodes do not have 'starargs' or 'kwargs' fields. 

5462 

5463 def do_Call(self, node: Node) -> None: 

5464 

5465 # The calls to op(')') and op('(') do nothing by default. 

5466 # Subclasses might handle them in an overridden tog.set_links. 

5467 self.visit(node.func) 

5468 self.op('(') 

5469 # No need to generate any commas. 

5470 self.handle_call_arguments(node) 

5471 self.op(')') 

5472 #@+node:ekr.20191204114930.1: *7* tog.arg_helper 

5473 def arg_helper(self, node: Union[Node, str]) -> None: 

5474 """ 

5475 Sync the node, with a special case for strings.

5476 """ 

5477 if isinstance(node, str): 

5478 self.token('name', node) 

5479 else: 

5480 self.visit(node) 

5481 #@+node:ekr.20191204105506.1: *7* tog.handle_call_arguments 

5482 def handle_call_arguments(self, node: Node) -> None: 

5483 """ 

5484 Generate arguments in the correct order. 

5485 

5486 Call(expr func, expr* args, keyword* keywords) 

5487 

5488 https://docs.python.org/3/reference/expressions.html#calls 

5489 

5490 Warning: This code will fail on Python 3.8 only for calls 

5491 containing kwargs in unexpected places. 

5492 """ 

5493 # *args: in node.args[]: Starred(value=Name(id='args')) 

5494 # *[a, 3]: in node.args[]: Starred(value=List(elts=[Name(id='a'), Num(n=3)]) 

5495 # **kwargs: in node.keywords[]: keyword(arg=None, value=Name(id='kwargs')) 

5496 # 

5497 # Scan args for *name or *List 

5498 args = node.args or [] 

5499 keywords = node.keywords or [] 

5500 

5501 def get_pos(obj: Any) -> Tuple[int, int, Any]: 

5502 line1 = getattr(obj, 'lineno', None) 

5503 col1 = getattr(obj, 'col_offset', None) 

5504 return line1, col1, obj 

5505 

5506 def sort_key(aTuple: Tuple) -> int: 

5507 line, col, obj = aTuple 

5508 return line * 1000 + col 

5509 

5510 if 0: # pragma: no cover 

5511 g.printObj([ast.dump(z) for z in args], tag='args') 

5512 g.printObj([ast.dump(z) for z in keywords], tag='keywords') 

5513 

5514 if py_version >= (3, 9): 

5515 places = [get_pos(z) for z in args + keywords] 

5516 places.sort(key=sort_key) 

5517 ordered_args = [z[2] for z in places] 

5518 for z in ordered_args: 

5519 if isinstance(z, ast.Starred): 

5520 self.op('*') 

5521 self.visit(z.value) 

5522 elif isinstance(z, ast.keyword): 

5523 if getattr(z, 'arg', None) is None: 

5524 self.op('**') 

5525 self.arg_helper(z.value) 

5526 else: 

5527 self.arg_helper(z.arg) 

5528 self.op('=') 

5529 self.arg_helper(z.value) 

5530 else: 

5531 self.arg_helper(z) 

5532 else: # pragma: no cover 

5533 # 

5534 # Legacy code: May fail for Python 3.8 

5535 # 

5536 # Scan args for *arg and *[...] 

5537 kwarg_arg = star_arg = None 

5538 for z in args: 

5539 if isinstance(z, ast.Starred): 

5540 if isinstance(z.value, ast.Name): # *Name. 

5541 star_arg = z 

5542 args.remove(z) 

5543 break 

5544 elif isinstance(z.value, (ast.List, ast.Tuple)): # *[...] 

5545 # star_list = z 

5546 break 

5547 raise AttributeError(f"Invalid * expression: {ast.dump(z)}") # pragma: no cover 

5548 # Scan keywords for **name. 

5549 for z in keywords: 

5550 if hasattr(z, 'arg') and z.arg is None: 

5551 kwarg_arg = z 

5552 keywords.remove(z) 

5553 break 

5554 # Sync the plain arguments. 

5555 for z in args: 

5556 self.arg_helper(z) 

5557 # Sync the keyword args. 

5558 for z in keywords: 

5559 self.arg_helper(z.arg) 

5560 self.op('=') 

5561 self.arg_helper(z.value) 

5562 # Sync the * arg. 

5563 if star_arg: 

5564 self.arg_helper(star_arg) 

5565 # Sync the ** kwarg. 

5566 if kwarg_arg: 

5567 self.op('**') 

5568 self.visit(kwarg_arg.value) 
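
A standalone check of the Python 3.9+ branch (keyword nodes gained lineno and col_offset in 3.9): merging node.args and node.keywords and sorting by source position recovers the textual order, here for a call whose * argument sits between two keyword arguments.

    import ast

    call = ast.parse('f(x=1, *b, y=2)').body[0].value
    merged = call.args + call.keywords
    merged.sort(key=lambda z: (z.lineno, z.col_offset))
    print([type(z).__name__ for z in merged])
    # ['keyword', 'Starred', 'keyword']: textual order, not args-then-keywords.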

5569 #@+node:ekr.20191113063144.69: *6* tog.Continue 

5570 def do_Continue(self, node: Node) -> None: 

5571 

5572 self.name('continue') 

5573 #@+node:ekr.20191113063144.70: *6* tog.Delete 

5574 def do_Delete(self, node: Node) -> None: 

5575 

5576 # No need to put commas. 

5577 self.name('del') 

5578 self.visit(node.targets) 

5579 #@+node:ekr.20191113063144.71: *6* tog.ExceptHandler 

5580 def do_ExceptHandler(self, node: Node) -> None: 

5581 

5582 # Except line... 

5583 self.name('except') 

5584 if getattr(node, 'type', None): 

5585 self.visit(node.type) 

5586 if getattr(node, 'name', None): 

5587 self.name('as') 

5588 self.name(node.name) 

5589 self.op(':') 

5590 # Body... 

5591 self.level += 1 

5592 self.visit(node.body) 

5593 self.level -= 1 

5594 #@+node:ekr.20191113063144.73: *6* tog.For 

5595 def do_For(self, node: Node) -> None: 

5596 

5597 # The def line... 

5598 self.name('for') 

5599 self.visit(node.target) 

5600 self.name('in') 

5601 self.visit(node.iter) 

5602 self.op(':') 

5603 # Body... 

5604 self.level += 1 

5605 self.visit(node.body) 

5606 # Else clause... 

5607 if node.orelse: 

5608 self.name('else') 

5609 self.op(':') 

5610 self.visit(node.orelse) 

5611 self.level -= 1 

5612 #@+node:ekr.20191113063144.74: *6* tog.Global 

5613 # Global(identifier* names) 

5614 

5615 def do_Global(self, node: Node) -> None: 

5616 

5617 self.name('global') 

5618 for z in node.names: 

5619 self.name(z) 

5620 #@+node:ekr.20191113063144.75: *6* tog.If & helpers 

5621 # If(expr test, stmt* body, stmt* orelse) 

5622 

5623 def do_If(self, node: Node) -> None: 

5624 #@+<< do_If docstring >> 

5625 #@+node:ekr.20191122222412.1: *7* << do_If docstring >> 

5626 """ 

5627 The parse trees for the following are identical! 

5628 

5629 if 1: if 1: 

5630 pass pass 

5631 else: elif 2: 

5632 if 2: pass 

5633 pass 

5634 

5635 So there is *no* way for the 'if' visitor to disambiguate the above two 

5636 cases from the parse tree alone. 

5637 

5638 Instead, we scan the tokens list for the next 'if', 'else' or 'elif' token. 

5639 """ 

5640 #@-<< do_If docstring >> 

5641 # Use the next significant token to distinguish between 'if' and 'elif'. 

5642 token = self.find_next_significant_token() 

5643 self.name(token.value) 

5644 self.visit(node.test) 

5645 self.op(':') 

5646 # 

5647 # Body... 

5648 self.level += 1 

5649 self.visit(node.body) 

5650 self.level -= 1 

5651 # 

5652 # Else and elif clauses... 

5653 if node.orelse: 

5654 self.level += 1 

5655 token = self.find_next_significant_token() 

5656 if token.value == 'else': 

5657 self.name('else') 

5658 self.op(':') 

5659 self.visit(node.orelse) 

5660 else: 

5661 self.visit(node.orelse) 

5662 self.level -= 1 
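
The docstring's claim is easy to verify: the two spellings produce identical parse trees.

    import ast

    s1 = 'if 1:\n    pass\nelse:\n    if 2:\n        pass\n'
    s2 = 'if 1:\n    pass\nelif 2:\n    pass\n'
    print(ast.dump(ast.parse(s1)) == ast.dump(ast.parse(s2)))  # True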

5663 #@+node:ekr.20191113063144.76: *6* tog.Import & helper 

5664 def do_Import(self, node: Node) -> None: 

5665 

5666 self.name('import') 

5667 for alias in node.names: 

5668 self.name(alias.name) 

5669 if alias.asname: 

5670 self.name('as') 

5671 self.name(alias.asname) 

5672 #@+node:ekr.20191113063144.77: *6* tog.ImportFrom 

5673 # ImportFrom(identifier? module, alias* names, int? level) 

5674 

5675 def do_ImportFrom(self, node: Node) -> None: 

5676 

5677 self.name('from') 

5678 for i in range(node.level): 

5679 self.op('.') 

5680 if node.module: 

5681 self.name(node.module) 

5682 self.name('import') 

5683 # No need to put commas. 

5684 for alias in node.names: 

5685 if alias.name == '*': # #1851. 

5686 self.op('*') 

5687 else: 

5688 self.name(alias.name) 

5689 if alias.asname: 

5690 self.name('as') 

5691 self.name(alias.asname) 

5692 #@+node:ekr.20220401034726.1: *6* tog.Match* (Python 3.10+) 

5693 # Match(expr subject, match_case* cases) 

5694 

5695 # match_case = (pattern pattern, expr? guard, stmt* body) 

5696 

5697 # Full syntax diagram: # https://peps.python.org/pep-0634/#appendix-a 

5698 

5699 def do_Match(self, node: Node) -> None: 

5700 

5701 cases = getattr(node, 'cases', []) 

5702 self.name('match') 

5703 self.visit(node.subject) 

5704 self.op(':') 

5705 for case in cases: 

5706 self.visit(case) 

5707 #@+node:ekr.20220401034726.2: *7* tog.match_case 

5708 # match_case = (pattern pattern, expr? guard, stmt* body) 

5709 

5710 def do_match_case(self, node: Node) -> None: 

5711 

5712 guard = getattr(node, 'guard', None) 

5713 body = getattr(node, 'body', []) 

5714 self.name('case') 

5715 self.visit(node.pattern) 

5716 if guard: 

5717 self.name('if') 

5718 self.visit(guard) 

5719 self.op(':') 

5720 for statement in body: 

5721 self.visit(statement) 

5722 #@+node:ekr.20220401034726.3: *7* tog.MatchAs 

5723 # MatchAs(pattern? pattern, identifier? name) 

5724 

5725 def do_MatchAs(self, node: Node) -> None: 

5726 pattern = getattr(node, 'pattern', None) 

5727 name = getattr(node, 'name', None) 

5728 if pattern and name: 

5729 self.visit(pattern) 

5730 self.name('as') 

5731 self.name(name) 

5732 elif pattern: 

5733 self.visit(pattern) # pragma: no cover 

5734 else: 

5735 self.name(name or '_') 

5736 #@+node:ekr.20220401034726.4: *7* tog.MatchClass 

5737 # MatchClass(expr cls, pattern* patterns, identifier* kwd_attrs, pattern* kwd_patterns) 

5738 

5739 def do_MatchClass(self, node: Node) -> None: 

5740 

5741 patterns = getattr(node, 'patterns', []) 

5742 kwd_attrs = getattr(node, 'kwd_attrs', []) 

5743 kwd_patterns = getattr(node, 'kwd_patterns', []) 

5744 self.visit(node.cls) 

5745 self.op('(') 

5746 for pattern in patterns: 

5747 self.visit(pattern) 

5748 for i, kwd_attr in enumerate(kwd_attrs): 

5749 self.name(kwd_attr) # a String. 

5750 self.op('=') 

5751 self.visit(kwd_patterns[i]) 

5752 self.op(')') 

5753 #@+node:ekr.20220401034726.5: *7* tog.MatchMapping 

5754 # MatchMapping(expr* keys, pattern* patterns, identifier? rest) 

5755 

5756 def do_MatchMapping(self, node: Node) -> None: 

5757 keys = getattr(node, 'keys', []) 

5758 patterns = getattr(node, 'patterns', []) 

5759 rest = getattr(node, 'rest', None) 

5760 self.op('{') 

5761 for i, key in enumerate(keys): 

5762 self.visit(key) 

5763 self.op(':') 

5764 self.visit(patterns[i]) 

5765 if rest: 

5766 self.op('**') 

5767 self.name(rest) # A string. 

5768 self.op('}') 

5769 #@+node:ekr.20220401034726.6: *7* tog.MatchOr 

5770 # MatchOr(pattern* patterns) 

5771 

5772 def do_MatchOr(self, node: Node) -> None: 

5773 patterns = getattr(node, 'patterns', []) 

5774 for i, pattern in enumerate(patterns): 

5775 if i > 0: 

5776 self.op('|') 

5777 self.visit(pattern) 

5778 #@+node:ekr.20220401034726.7: *7* tog.MatchSequence 

5779 # MatchSequence(pattern* patterns) 

5780 

5781 def do_MatchSequence(self, node: Node) -> None: 

5782 patterns = getattr(node, 'patterns', []) 

5783 # Scan for the next '(' or '[' token, skipping the 'case' token. 

5784 token = None 

5785 for token in self.tokens[self.px + 1 :]: 

5786 if token.kind == 'op' and token.value in '([': 

5787 break 

5788 if is_significant_token(token): 

5789 # An implicit tuple: there is no '(' or '[' token. 

5790 token = None 

5791 break 

5792 else: 

5793 raise AssignLinksError('Ill-formed tuple') # pragma: no cover 

5794 if token: 

5795 self.op(token.value) 

5796 for i, pattern in enumerate(patterns): 

5797 self.visit(pattern) 

5798 if token: 

5799 self.op(']' if token.value == '[' else ')') 
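
The bracket really is invisible in the parse tree (Python 3.10+): both spellings below produce a MatchSequence, so only a token scan can recover the delimiter, if any.

    import ast

    for source in ('match x:\n case 1, 2:\n  pass\n',
                   'match x:\n case [1, 2]:\n  pass\n'):
        case = ast.parse(source).body[0].cases[0]
        print(type(case.pattern).__name__)  # MatchSequence, both times.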

5800 #@+node:ekr.20220401034726.8: *7* tog.MatchSingleton 

5801 # MatchSingleton(constant value) 

5802 

5803 def do_MatchSingleton(self, node: Node) -> None: 

5804 """Match True, False or None.""" 

5805 # g.trace(repr(node.value)) 

5806 self.token('name', repr(node.value)) 

5807 #@+node:ekr.20220401034726.9: *7* tog.MatchStar 

5808 # MatchStar(identifier? name) 

5809 

5810 def do_MatchStar(self, node: Node) -> None: 

5811 name = getattr(node, 'name', None) 

5812 self.op('*') 

5813 if name: 

5814 self.name(name) 

5815 #@+node:ekr.20220401034726.10: *7* tog.MatchValue 

5816 # MatchValue(expr value) 

5817 

5818 def do_MatchValue(self, node: Node) -> None: 

5819 

5820 self.visit(node.value) 

5821 #@+node:ekr.20191113063144.78: *6* tog.Nonlocal 

5822 # Nonlocal(identifier* names) 

5823 

5824 def do_Nonlocal(self, node: Node) -> None: 

5825 

5826 # nonlocal %s\n' % ','.join(node.names)) 

5827 # No need to put commas. 

5828 self.name('nonlocal') 

5829 for z in node.names: 

5830 self.name(z) 

5831 #@+node:ekr.20191113063144.79: *6* tog.Pass 

5832 def do_Pass(self, node: Node) -> None: 

5833 

5834 self.name('pass') 

5835 #@+node:ekr.20191113063144.81: *6* tog.Raise 

5836 # Raise(expr? exc, expr? cause) 

5837 

5838 def do_Raise(self, node: Node) -> None: 

5839 

5840 # No need to put commas. 

5841 self.name('raise') 

5842 exc = getattr(node, 'exc', None) 

5843 cause = getattr(node, 'cause', None) 

5844 tback = getattr(node, 'tback', None) 

5845 self.visit(exc) 

5846 if cause: 

5847 self.name('from') # #2446. 

5848 self.visit(cause) 

5849 self.visit(tback) 

5850 #@+node:ekr.20191113063144.82: *6* tog.Return 

5851 def do_Return(self, node: Node) -> None: 

5852 

5853 self.name('return') 

5854 self.visit(node.value) 

5855 #@+node:ekr.20191113063144.85: *6* tog.Try 

5856 # Try(stmt* body, excepthandler* handlers, stmt* orelse, stmt* finalbody) 

5857 

5858 def do_Try(self, node: Node) -> None: 

5859 

5860 # Try line... 

5861 self.name('try') 

5862 self.op(':') 

5863 # Body... 

5864 self.level += 1 

5865 self.visit(node.body) 

5866 self.visit(node.handlers) 

5867 # Else... 

5868 if node.orelse: 

5869 self.name('else') 

5870 self.op(':') 

5871 self.visit(node.orelse) 

5872 # Finally... 

5873 if node.finalbody: 

5874 self.name('finally') 

5875 self.op(':') 

5876 self.visit(node.finalbody) 

5877 self.level -= 1 

5878 #@+node:ekr.20191113063144.88: *6* tog.While 

5879 def do_While(self, node: Node) -> None: 

5880 

5881 # While line... 

5882 # while %s:\n' 

5883 self.name('while') 

5884 self.visit(node.test) 

5885 self.op(':') 

5886 # Body... 

5887 self.level += 1 

5888 self.visit(node.body) 

5889 # Else clause... 

5890 if node.orelse: 

5891 self.name('else') 

5892 self.op(':') 

5893 self.visit(node.orelse) 

5894 self.level -= 1 

5895 #@+node:ekr.20191113063144.89: *6* tog.With 

5896 # With(withitem* items, stmt* body) 

5897 

5898 # withitem = (expr context_expr, expr? optional_vars) 

5899 

5900 def do_With(self, node: Node) -> None: 

5901 

5902 expr: Optional[ast.AST] = getattr(node, 'context_expression', None) 

5903 items: List[ast.AST] = getattr(node, 'items', []) 

5904 self.name('with') 

5905 self.visit(expr) 

5906 # No need to put commas. 

5907 for item in items: 

5908 self.visit(item.context_expr) 

5909 optional_vars = getattr(item, 'optional_vars', None) 

5910 if optional_vars is not None: 

5911 self.name('as') 

5912 self.visit(item.optional_vars) 

5913 # End the line. 

5914 self.op(':') 

5915 # Body... 

5916 self.level += 1 

5917 self.visit(node.body) 

5918 self.level -= 1 

5919 #@+node:ekr.20191113063144.90: *6* tog.Yield 

5920 def do_Yield(self, node: Node) -> None: 

5921 

5922 self.name('yield') 

5923 if hasattr(node, 'value'): 

5924 self.visit(node.value) 

5925 #@+node:ekr.20191113063144.91: *6* tog.YieldFrom 

5926 # YieldFrom(expr value) 

5927 

5928 def do_YieldFrom(self, node: Node) -> None: 

5929 

5930 self.name('yield') 

5931 self.name('from') 

5932 self.visit(node.value) 

5933 #@-others 

5934#@+node:ekr.20191226195813.1: *3* class TokenOrderTraverser 

5935class TokenOrderTraverser: 

5936 """ 

5937 Traverse an ast tree using the parent/child links created by the 

5938 TokenOrderGenerator class. 

5939 

5940 **Important**: 

5941 

5942 This class is a curio. It is no longer used in this file! 

5943 The Fstringify and ReassignTokens classes now use ast.walk. 

5944 """ 

5945 #@+others 

5946 #@+node:ekr.20191226200154.1: *4* TOT.traverse 

5947 def traverse(self, tree: Node) -> int: 

5948 """ 

5949 Call visit, in token order, for all nodes in tree. 

5950 

5951 Recursion is not allowed. 

5952 

5953 The code follows p.moveToThreadNext exactly. 

5954 """ 

5955 

5956 def has_next(i: int, node: Node, stack: List[int]) -> bool: 

5957 """Return True if node.parent has a child at index i."""

5958 # g.trace(node.__class__.__name__, stack) 

5959 parent = node.parent 

5960 return bool(parent and parent.children and i < len(parent.children)) 

5961 

5962 # Update stats 

5963 

5964 self.last_node_index = -1 # For visit 

5965 # The stack contains child indices. 

5966 node, stack = tree, [0] 

5967 seen = set() 

5968 while node and stack: 

5969 # g.trace( 

5970 # f"{node.node_index:>3} " 

5971 # f"{node.__class__.__name__:<12} {stack}") 

5972 # Visit the node. 

5973 assert node.node_index not in seen, node.node_index 

5974 seen.add(node.node_index) 

5975 self.visit(node) 

5976 # if p.v.children: p.moveToFirstChild() 

5977 children: List[ast.AST] = getattr(node, 'children', []) 

5978 if children: 

5979 # Move to the first child. 

5980 stack.append(0) 

5981 node = children[0] 

5982 # g.trace(' child:', node.__class__.__name__, stack) 

5983 continue 

5984 # elif p.hasNext(): p.moveToNext() 

5985 stack[-1] += 1 

5986 i = stack[-1] 

5987 if has_next(i, node, stack): 

5988 node = node.parent.children[i] 

5989 continue 

5990 # else... 

5991 # p.moveToParent() 

5992 node = node.parent 

5993 stack.pop() 

5994 # while p: 

5995 while node and stack: 

5996 # if p.hasNext(): 

5997 stack[-1] += 1 

5998 i = stack[-1] 

5999 if has_next(i, node, stack): 

6000 # Move to the next sibling. 

6001 node = node.parent.children[i] 

6002 break # Found. 

6003 # p.moveToParent() 

6004 node = node.parent 

6005 stack.pop() 

6006 # not found. 

6007 else: 

6008 break # pragma: no cover 

6009 return self.last_node_index 
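
The same pattern works on any tree carrying parent/children links. A self-contained sketch, using ast.iter_child_nodes to inject the links (so ctx nodes such as Store appear, unlike in TOG-annotated trees); add_links and preorder are hypothetical helpers:

    import ast

    def add_links(node, parent=None):
        node.parent = parent
        node.children = list(ast.iter_child_nodes(node))
        for child in node.children:
            add_links(child, node)

    def preorder(tree):
        # A stack of child indices replaces recursion.
        node, stack = tree, [0]
        while node and stack:
            yield node
            if node.children:  # Move to the first child.
                stack.append(0)
                node = node.children[0]
                continue
            while node and stack:  # Move to the next sibling, else up.
                stack[-1] += 1
                i = stack[-1]
                parent = node.parent
                if parent and i < len(parent.children):
                    node = parent.children[i]
                    break
                node = parent
                stack.pop()
            else:
                break

    tree = ast.parse('a = 1')
    add_links(tree)
    print([z.__class__.__name__ for z in preorder(tree)])
    # ['Module', 'Assign', 'Name', 'Store', 'Constant']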

6010 #@+node:ekr.20191227160547.1: *4* TOT.visit 

6011 def visit(self, node: Node) -> None: 

6012 

6013 self.last_node_index += 1 

6014 assert self.last_node_index == node.node_index, ( 

6015 self.last_node_index, node.node_index) 

6016 #@-others 

6017#@-others 

6018g = LeoGlobals() 

6019if __name__ == '__main__': 

6020 main() # pragma: no cover 

6021#@@language python 

6022#@@tabwidth -4 

6023#@@pagewidth 70 

6024#@-leo