Coverage for C:\Repos\leo-editor\leo\core\leoAst.py: 99%
1996 statements
coverage.py v6.4, created at 2022-05-24 10:21 -0500
1# -*- coding: utf-8 -*-
2#@+leo-ver=5-thin
3#@+node:ekr.20141012064706.18389: * @file leoAst.py
4#@@first
5# This file is part of Leo: https://leoeditor.com
6# Leo's copyright notice is based on the MIT license: http://leoeditor.com/license.html
8# For now, suppress all mypy checks
9# type: ignore
10#@+<< docstring >>
11#@+node:ekr.20200113081838.1: ** << docstring >> (leoAst.py)
12"""
13leoAst.py: This file does not depend on Leo in any way.
15The classes in this file unify python's token-based and ast-based worlds by
16creating two-way links between tokens in the token list and ast nodes in
17the parse tree. For more details, see the "Overview" section below.
20**Stand-alone operation**
22usage:
23 leoAst.py --help
24 leoAst.py [--fstringify | --fstringify-diff | --orange | --orange-diff] PATHS
25 leoAst.py --py-cov [ARGS]
26 leoAst.py --pytest [ARGS]
27 leoAst.py --unittest [ARGS]
29examples:
30 --py-cov "-f TestOrange"
31 --pytest "-f TestOrange"
32 --unittest TestOrange
34positional arguments:
35 PATHS directory or list of files
37optional arguments:
38 -h, --help show this help message and exit
39 --fstringify leonine fstringify
40 --fstringify-diff show fstringify diff
41 --orange leonine Black
42 --orange-diff show orange diff
43 --py-cov run pytest --cov on leoAst.py
44 --pytest run pytest on leoAst.py
45 --unittest run unittest on leoAst.py
48**Overview**
50leoAst.py unifies python's token-oriented and ast-oriented worlds.
52leoAst.py defines classes that create two-way links between tokens
53created by python's tokenize module and parse tree nodes created by
54python's ast module:
56The Token Order Generator (TOG) class quickly creates the following
57links:
59- An *ordered* children array from each ast node to its children.
61- A parent link from each ast.node to its parent.
63- Two-way links between tokens in the token list, a list of Token
64 objects, and the ast nodes in the parse tree:
66 - For each token, token.node contains the ast.node "responsible" for
67 the token.
69 - For each ast node, node.first_i and node.last_i are indices into
70 the token list. These indices give the range of tokens that can be
71 said to be "generated" by the ast node.
73Once the TOG class has inserted parent/child links, the Token Order
74Traverser (TOT) class traverses trees annotated with parent/child
75links extremely quickly.
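
For example, here is a minimal sketch of reading these links after the TOG
has run. It uses the TokenOrderGenerator.init_from_file entry point defined
later in this file; the file name is hypothetical::

    tog = TokenOrderGenerator()
    contents, encoding, tokens, tree = tog.init_from_file('example.py')
    for token in tokens:
        node = token.node  # The ast node "responsible" for this token, or None.
        if node is not None:
            # node.first_i and node.last_i bound the tokens the node generates.
            assert node.first_i <= token.index <= node.last_i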
78**Applicability and importance**
80Many python developers will find asttokens meets all their needs.
81asttokens is well documented and easy to use. Nevertheless, two-way
82links are significant additions to python's tokenize and ast modules:
84- Links from tokens to nodes are assigned to the nearest possible ast
85 node, not the nearest statement, as in asttokens. Links can easily
86 be reassigned, if desired.
88- The TOG and TOT classes are intended to be the foundation of tools
89 such as fstringify and black.
91- The TOG class solves real problems, such as:
92 https://stackoverflow.com/questions/16748029/
94**Known bug**
96This file has no known bugs *except* for Python version 3.8.
98For Python 3.8, syncing tokens will fail for function calls such as:
100 f(1, x=2, *[3, 4], y=5)
102that is, for calls where keywords appear before non-keyword args.
104There are no plans to fix this bug. The workaround is to use Python version
1053.9 or above.
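
For example, a caller that must still run under Python 3.8 could warn about
the situation up front. This is only a sketch, using the module-level
py_version tuple defined later in this file::

    if py_version == (3, 8):
        print('leoAst.py: syncing may fail for calls with keywords before non-keyword args')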
108**Figures of merit**
110Simplicity: The code consists primarily of a set of generators, one
111for every kind of ast node.
113Speed: The TOG creates two-way links between tokens and ast nodes in
114roughly the time taken by python's tokenize.tokenize and ast.parse
115library methods. This is substantially faster than the asttokens,
116black or fstringify tools. The TOT class traverses trees annotated
117with parent/child links even more quickly.
119Memory: The TOG class makes no significant demands on python's
120resources. Generators add nothing to python's call stack.
121TOG.node_stack is the only variable-length data. This stack resides in
122python's heap, so its length is unimportant. In the worst case, it
123might contain a few thousand entries. The TOT class uses no
124variable-length data at all.
126**Links**
128Leo...
129Ask for help: https://groups.google.com/forum/#!forum/leo-editor
130Report a bug: https://github.com/leo-editor/leo-editor/issues
131leoAst.py docs: http://leoeditor.com/appendices.html#leoast-py
133Other tools...
134asttokens: https://pypi.org/project/asttokens
135black: https://pypi.org/project/black/
136fstringify: https://pypi.org/project/fstringify/
138Python modules...
139tokenize.py: https://docs.python.org/3/library/tokenize.html
140ast.py: https://docs.python.org/3/library/ast.html
142**Studying this file**
144I strongly recommend that you use Leo when studying this code so that you
145will see the file's intended outline structure.
147Without Leo, you will see only special **sentinel comments** that create
148Leo's outline structure. These comments have the form::
150 `#@<comment-kind>:<user-id>.<timestamp>.<number>: <outline-level> <headline>`
151"""
152#@-<< docstring >>
153#@+<< imports >>
154#@+node:ekr.20200105054219.1: ** << imports >> (leoAst.py)
155import argparse
156import ast
157import codecs
158import difflib
159import glob
160import io
161import os
162import re
163import sys
164import textwrap
165import tokenize
166import traceback
167from typing import Any, Callable, Dict, Generator, List, Optional, Tuple, Union
168#@-<< imports >>
169Node = ast.AST
170ActionList = List[Tuple[Callable, Any]]
171v1, v2, junk1, junk2, junk3 = sys.version_info
172py_version = (v1, v2)
174# Async tokens exist only in Python 3.5 and 3.6.
175# https://docs.python.org/3/library/token.html
176has_async_tokens = (3, 5) <= py_version <= (3, 6)
178# has_position_only_params = (v1, v2) >= (3, 8)
179#@+others
180#@+node:ekr.20191226175251.1: ** class LeoGlobals
181#@@nosearch
184class LeoGlobals: # pragma: no cover
185 """
186 Simplified version of functions in leoGlobals.py.
187 """
189 total_time = 0.0 # For unit testing.
191 #@+others
192 #@+node:ekr.20191226175903.1: *3* LeoGlobals.callerName
193 def callerName(self, n: int) -> str:
194 """Get the function name from the call stack."""
195 try:
196 f1 = sys._getframe(n)
197 code1 = f1.f_code
198 return code1.co_name
199 except Exception:
200 return ''
201 #@+node:ekr.20191226175426.1: *3* LeoGlobals.callers
202 def callers(self, n: int=4) -> str:
203 """
204 Return a string containing a comma-separated list of the callers
205 of the function that called g.callerList.
206 """
207 i, result = 2, []
208 while True:
209 s = self.callerName(n=i)
210 if s:
211 result.append(s)
212 if not s or len(result) >= n:
213 break
214 i += 1
215 return ','.join(reversed(result))
216 #@+node:ekr.20191226190709.1: *3* leoGlobals.es_exception & helper
217 def es_exception(self, full: bool=True) -> Tuple[str, int]:
218 typ, val, tb = sys.exc_info()
219 for line in traceback.format_exception(typ, val, tb):
220 print(line)
221 fileName, n = self.getLastTracebackFileAndLineNumber()
222 return fileName, n
223 #@+node:ekr.20191226192030.1: *4* LeoGlobals.getLastTracebackFileAndLineNumber
224 def getLastTracebackFileAndLineNumber(self) -> Tuple[str, int]:
225 typ, val, tb = sys.exc_info()
226 if typ == SyntaxError:
227 # IndentationError is a subclass of SyntaxError.
228 # SyntaxError *does* have 'filename' and 'lineno' attributes.
229 return val.filename, val.lineno
230 #
231 # Data is a list of tuples, one per stack entry.
232 # The tuples have the form (filename, lineNumber, functionName, text).
233 data = traceback.extract_tb(tb)
234 item = data[-1] # Get the item at the top of the stack.
235 filename, n, functionName, text = item
236 return filename, n
237 #@+node:ekr.20200220065737.1: *3* LeoGlobals.objToString
238 def objToString(self, obj: Any, tag: str=None) -> str:
239 """Simplified version of g.printObj."""
240 result = []
241 if tag:
242 result.append(f"{tag}...")
243 if isinstance(obj, str):
244 obj = g.splitLines(obj)
245 if isinstance(obj, list):
246 result.append('[')
247 for z in obj:
248 result.append(f" {z!r}")
249 result.append(']')
250 elif isinstance(obj, tuple):
251 result.append('(')
252 for z in obj:
253 result.append(f" {z!r}")
254 result.append(')')
255 else:
256 result.append(repr(obj))
257 result.append('')
258 return '\n'.join(result)
259 #@+node:ekr.20220327132500.1: *3* LeoGlobals.pdb
260 def pdb(self) -> None:
261 import pdb as _pdb
262 # pylint: disable=forgotten-debug-statement
263 _pdb.set_trace()
264 #@+node:ekr.20191226190425.1: *3* LeoGlobals.plural
265 def plural(self, obj: Any) -> str:
266 """Return "s" or "" depending on n."""
267 if isinstance(obj, (list, tuple, str)):
268 n = len(obj)
269 else:
270 n = obj
271 return '' if n == 1 else 's'
272 #@+node:ekr.20191226175441.1: *3* LeoGlobals.printObj
273 def printObj(self, obj: Any, tag: str=None) -> None:
274 """Simplified version of g.printObj."""
275 print(self.objToString(obj, tag))
276 #@+node:ekr.20220327120618.1: *3* LeoGlobals.shortFileName
277 def shortFileName(self, fileName: str) -> str:
278 """Return the base name of a path."""
279 return os.path.basename(fileName) if fileName else ''
280 #@+node:ekr.20191226190131.1: *3* LeoGlobals.splitLines
281 def splitLines(self, s: str) -> List[str]:
282 """Split s into lines, preserving the number of lines and
283 the endings of all lines, including the last line."""
284 # g.stat()
285 if s:
286 return s.splitlines(True) # This is a Python string function!
287 return []
288 #@+node:ekr.20191226190844.1: *3* LeoGlobals.toEncodedString
289 def toEncodedString(self, s: Any, encoding: str='utf-8') -> bytes:
290 """Convert unicode string to an encoded string."""
291 if not isinstance(s, str):
292 return s
293 try:
294 s = s.encode(encoding, "strict")
295 except UnicodeError:
296 s = s.encode(encoding, "replace")
297 print(f"toEncodedString: Error converting {s!r} to {encoding}")
298 return s
299 #@+node:ekr.20191226190006.1: *3* LeoGlobals.toUnicode
300 def toUnicode(self, s: Any, encoding: str='utf-8') -> str:
301 """Convert bytes to unicode if necessary."""
302 tag = 'g.toUnicode'
303 if isinstance(s, str):
304 return s
305 if not isinstance(s, bytes):
306 print(f"{tag}: bad s: {s!r}")
307 return ''
308 b: bytes = s
309 try:
310 s2 = b.decode(encoding, 'strict')
311 except(UnicodeDecodeError, UnicodeError):
312 s2 = b.decode(encoding, 'replace')
313 print(f"{tag}: unicode error. encoding: {encoding!r}, s2:\n{s2!r}")
314 g.trace(g.callers())
315 except Exception:
316 g.es_exception()
317 print(f"{tag}: unexpected error! encoding: {encoding!r}, s2:\n{s2!r}")
318 g.trace(g.callers())
319 return s2
320 #@+node:ekr.20191226175436.1: *3* LeoGlobals.trace
321 def trace(self, *args: Any) -> None:
322 """Print a tracing message."""
323 # Compute the caller name.
324 try:
325 f1 = sys._getframe(1)
326 code1 = f1.f_code
327 name = code1.co_name
328 except Exception:
329 name = ''
330 print(f"{name}: {' '.join(str(z) for z in args)}")
331 #@+node:ekr.20191226190241.1: *3* LeoGlobals.truncate
332 def truncate(self, s: str, n: int) -> str:
333 """Return s truncated to n characters."""
334 if len(s) <= n:
335 return s
336 s2 = s[: n - 3] + f"...({len(s)})"
337 return s2 + '\n' if s.endswith('\n') else s2
338 #@-others
339#@+node:ekr.20200702114522.1: ** leoAst.py: top-level commands
340#@+node:ekr.20200702114557.1: *3* command: fstringify_command
341def fstringify_command(files: List[str]) -> None:
342 """
343 Entry point for --fstringify.
345 Fstringify the given file, overwriting the file.
346 """
347 for filename in files: # pragma: no cover
348 if os.path.exists(filename):
349 print(f"fstringify {filename}")
350 Fstringify().fstringify_file_silent(filename)
351 else:
352 print(f"file not found: {filename}")
353#@+node:ekr.20200702121222.1: *3* command: fstringify_diff_command
354def fstringify_diff_command(files: List[str]) -> None:
355 """
356 Entry point for --fstringify-diff.
358 Print the diff that would be produced by fstringify.
359 """
360 for filename in files: # pragma: no cover
361 if os.path.exists(filename):
362 print(f"fstringify-diff {filename}")
363 Fstringify().fstringify_file_diff(filename)
364 else:
365 print(f"file not found: {filename}")
366#@+node:ekr.20200702115002.1: *3* command: orange_command
367def orange_command(files: List[str], settings: Optional[Dict[str, Any]]=None) -> None:
369 for filename in files: # pragma: no cover
370 if os.path.exists(filename):
371 # print(f"orange {filename}")
372 Orange(settings).beautify_file(filename)
373 else:
374 print(f"file not found: {filename}")
375 print(f"Beautify done: {len(files)} files")
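# The commands above can also be called programmatically. A hedged sketch
# (the file name is hypothetical; the settings keys mirror those built by
# scan_ast_args below):
#
#   orange_command(
#       ['leoAst.py'],
#       settings={
#           'allow_joined_strings': False,
#           'max_join_line_length': 0,
#           'max_split_line_length': 0,
#           'tab_width': 4,
#       })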
376#@+node:ekr.20200702121315.1: *3* command: orange_diff_command
377def orange_diff_command(files: List[str], settings: Optional[Dict[str, Any]]=None) -> None:
379 for filename in files: # pragma: no cover
380 if os.path.exists(filename):
381 print(f"orange-diff {filename}")
382 Orange(settings).beautify_file_diff(filename)
383 else:
384 print(f"file not found: {filename}")
385#@+node:ekr.20160521104628.1: ** leoAst.py: top-level utils
386if 1: # pragma: no cover
387 #@+others
388 #@+node:ekr.20200702102239.1: *3* function: main (leoAst.py)
389 def main() -> None:
390 """Run commands specified by sys.argv."""
391 args, settings_dict, arg_files, recursive = scan_ast_args()
392 # Finalize arguments.
393 cwd, files = os.getcwd(), []
394 for path in arg_files:
395 root_dir = os.path.join(cwd, path)
396 files.extend(glob.glob(f'{root_dir}**{os.sep}*.py', recursive=recursive))  # Accumulate files for all PATHS.
397 if not files:
398 print('No files found')
399 return
400 # Execute the command.
401 print(f"Found {len(files)} file{g.plural(len(files))}.")
402 if args.f:
403 fstringify_command(files)
404 if args.fd:
405 fstringify_diff_command(files)
406 if args.o:
407 orange_command(files, settings_dict)
408 if args.od:
409 orange_diff_command(files, settings_dict)
410 #@+node:ekr.20220404062739.1: *3* function: scan_ast_args
411 def scan_ast_args() -> Tuple[Any, Dict[str, Any], List[str], bool]:
412 description = textwrap.dedent("""\
413 Execute fstringify or beautify commands contained in leoAst.py.
414 """)
415 parser = argparse.ArgumentParser(
416 description=description,
417 formatter_class=argparse.RawTextHelpFormatter)
418 parser.add_argument('PATHS', nargs='*', help='directory or list of files')
419 group = parser.add_mutually_exclusive_group(required=False) # Don't require any args.
420 add = group.add_argument
421 add('--fstringify', dest='f', action='store_true',
422 help='fstringify PATHS')
423 add('--fstringify-diff', dest='fd', action='store_true',
424 help='fstringify diff PATHS')
425 add('--orange', dest='o', action='store_true',
426 help='beautify PATHS')
427 add('--orange-diff', dest='od', action='store_true',
428 help='diff beautify PATHS')
429 # New arguments.
430 add2 = parser.add_argument
431 add2('--allow-joined', dest='allow_joined', action='store_true',
432 help='allow joined strings')
433 add2('--max-join', dest='max_join', metavar='N', type=int,
434 help='max unsplit line length (default 0)')
435 add2('--max-split', dest='max_split', metavar='N', type=int,
436 help='max unjoined line length (default 0)')
437 add2('--recursive', dest='recursive', action='store_true',
438 help='include directories recursively')
439 add2('--tab-width', dest='tab_width', metavar='N', type=int,
440 help='tab-width (default -4)')
441 # Create the return values, using EKR's prefs as the defaults.
442 parser.set_defaults(
443 allow_joined=False,
444 max_join=0,
445 max_split=0,
446 recursive=False,
447 tab_width=4,
448 )
449 args = parser.parse_args()
450 files = args.PATHS
451 recursive = args.recursive
452 # Create the settings dict, ensuring proper values.
453 settings_dict: Dict[str, Any] = {
454 'allow_joined_strings': bool(args.allow_joined),
455 'max_join_line_length': abs(args.max_join),
456 'max_split_line_length': abs(args.max_split),
457 'tab_width': abs(args.tab_width), # Must be positive!
458 }
459 return args, settings_dict, files, recursive
460 #@+node:ekr.20200107114409.1: *3* functions: reading & writing files
461 #@+node:ekr.20200218071822.1: *4* function: regularize_nls
462 def regularize_nls(s: str) -> str:
463 """Regularize newlines within s."""
464 return s.replace('\r\n', '\n').replace('\r', '\n')
465 #@+node:ekr.20200106171502.1: *4* function: get_encoding_directive
466 # This is the pattern in PEP 263.
467 encoding_pattern = re.compile(r'^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)')
469 def get_encoding_directive(bb: bytes) -> str:
470 """
471 Get the encoding from the encoding directive at the start of a file.
473 bb: The bytes of the file.
475 Returns the codec name, or 'UTF-8'.
477 Adapted from pyzo. Copyright 2008 to 2020 by Almar Klein.
478 """
479 for line in bb.split(b'\n', 2)[:2]:
480 # Try to make line a string
481 try:
482 line2 = line.decode('ASCII').strip()
483 except Exception:
484 continue
485 # Does the line match the PEP 263 pattern?
486 m = encoding_pattern.match(line2)
487 if not m:
488 continue
489 # Is it a known encoding? Correct the name if it is.
490 try:
491 c = codecs.lookup(m.group(1))
492 return c.name
493 except Exception:
494 pass
495 return 'UTF-8'
496 #@+node:ekr.20200103113417.1: *4* function: read_file
497 def read_file(filename: str, encoding: str='utf-8') -> Optional[str]:
498 """
499 Return the contents of the file with the given name.
500 Print an error message and return None on error.
501 """
502 tag = 'read_file'
503 try:
504 # Translate all newlines to '\n'.
505 with open(filename, 'r', encoding=encoding) as f:
506 s = f.read()
507 return regularize_nls(s)
508 except Exception:
509 print(f"{tag}: can not read {filename}")
510 return None
511 #@+node:ekr.20200106173430.1: *4* function: read_file_with_encoding
512 def read_file_with_encoding(filename: str) -> Tuple[str, str]:
513 """
514 Read the file with the given name, returning (e, s), where:
516 s is the string, converted to unicode, or '' if there was an error.
518 e is the encoding of s, computed in the following order:
520 - The BOM encoding if the file starts with a BOM mark.
521 - The encoding given in the # -*- coding: utf-8 -*- line.
522 - The encoding given by the 'encoding' keyword arg.
523 - 'utf-8'.
524 """
525 # First, read the file.
526 tag = 'read_with_encoding'
527 try:
528 with open(filename, 'rb') as f:
529 bb = f.read()
530 except Exception:
531 print(f"{tag}: can not read {filename}")
532 if not bb:
533 return 'UTF-8', ''
534 # Look for the BOM.
535 e, bb = strip_BOM(bb)
536 if not e:
537 # Python's encoding comments override everything else.
538 e = get_encoding_directive(bb)
539 s = g.toUnicode(bb, encoding=e)
540 s = regularize_nls(s)
541 return e, s
542 #@+node:ekr.20200106174158.1: *4* function: strip_BOM
543 def strip_BOM(bb: bytes) -> Tuple[Optional[str], bytes]:
544 """
545 bb must be the bytes contents of a file.
547 If bb starts with a BOM (Byte Order Mark), return (e, bb2), where:
549 - e is the encoding implied by the BOM.
550 - bb2 is bb, stripped of the BOM.
552 If there is no BOM, return (None, bb)
553 """
554 assert isinstance(bb, bytes), bb.__class__.__name__
555 table = (
556 # Test longer bom's first.
557 (4, 'utf-32', codecs.BOM_UTF32_BE),
558 (4, 'utf-32', codecs.BOM_UTF32_LE),
559 (3, 'utf-8', codecs.BOM_UTF8),
560 (2, 'utf-16', codecs.BOM_UTF16_BE),
561 (2, 'utf-16', codecs.BOM_UTF16_LE),
562 )
563 for n, e, bom in table:
564 assert len(bom) == n
565 if bom == bb[: len(bom)]:
566 return e, bb[len(bom) :]
567 return None, bb
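# Worked examples for strip_BOM (a sketch; the byte strings are hypothetical):
#   strip_BOM(codecs.BOM_UTF8 + b'print(1)')  returns  ('utf-8', b'print(1)')
#   strip_BOM(b'print(1)')                    returns  (None, b'print(1)')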
568 #@+node:ekr.20200103163100.1: *4* function: write_file
569 def write_file(filename: str, s: str, encoding: str='utf-8') -> None:
570 """
571 Write the string s to the file whose name is given.
573 Handle all exceptions.
575 Before calling this function, the caller should ensure
576 that the file actually has been changed.
577 """
578 try:
579 # Write the file with platform-dependent newlines.
580 with open(filename, 'w', encoding=encoding) as f:
581 f.write(s)
582 except Exception as e:
583 g.trace(f"Error writing {filename}\n{e}")
584 #@+node:ekr.20200113154120.1: *3* functions: tokens
585 #@+node:ekr.20191223093539.1: *4* function: find_anchor_token
586 def find_anchor_token(node: Node, global_token_list: List["Token"]) -> Optional["Token"]:
587 """
588 Return the anchor_token for node, a token such that token.node == node.
590 The search starts at node, and then all the usual child nodes.
591 """
593 node1 = node
595 def anchor_token(node: Node) -> Optional["Token"]:
596 """Return the anchor token in node.token_list"""
597 # Careful: some tokens in the token list may have been killed.
598 for token in get_node_token_list(node, global_token_list):
599 if is_ancestor(node1, token):
600 return token
601 return None
603 # This table only has to cover fields for ast.Nodes that
604 # won't have any associated token.
606 fields = (
607 # Common...
608 'elt', 'elts', 'body', 'value', # Less common...
609 'dims', 'ifs', 'names', 's',
610 'test', 'values', 'targets',
611 )
612 while node:
613 # First, try the node itself.
614 token = anchor_token(node)
615 if token:
616 return token
617 # Second, try the most common nodes w/o token_lists:
618 if isinstance(node, ast.Call):
619 node = node.func
620 elif isinstance(node, ast.Tuple):
621 node = node.elts # type:ignore
622 # Finally, try all other nodes.
623 else:
624 # This will be used rarely.
625 for field in fields:
626 node = getattr(node, field, None)
627 if node:
628 token = anchor_token(node)
629 if token:
630 return token
631 else:
632 break
633 return None
634 #@+node:ekr.20191231160225.1: *4* function: find_paren_token (changed signature)
635 def find_paren_token(i: int, global_token_list: List["Token"]) -> int:
636 """Return i of the next paren token, starting at tokens[i]."""
637 while i < len(global_token_list):
638 token = global_token_list[i]
639 if token.kind == 'op' and token.value in '()':
640 return i
641 if is_significant_token(token):
642 break
643 i += 1
644 return None
645 #@+node:ekr.20200113110505.4: *4* function: get_node_tokens_list
646 def get_node_token_list(node: Node, global_tokens_list: List["Token"]) -> List["Token"]:
647 """
648 tokens_list must be the global tokens list.
649 Return the tokens assigned to the node, or [].
650 """
651 i = getattr(node, 'first_i', None)
652 j = getattr(node, 'last_i', None)
653 return [] if i is None else global_tokens_list[i : j + 1]
654 #@+node:ekr.20191124123830.1: *4* function: is_significant & is_significant_token
655 def is_significant(kind: str, value: str) -> bool:
656 """
657 Return True if (kind, value) represent a token that can be used for
658 syncing generated tokens with the token list.
659 """
660 # Making 'endmarker' significant ensures that all tokens are synced.
661 return (
662 kind in ('async', 'await', 'endmarker', 'name', 'number', 'string') or
663 kind == 'op' and value not in ',;()')
665 def is_significant_token(token: "Token") -> bool:
666 """Return True if the given token is a synchronizing token."""
667 return is_significant(token.kind, token.value)
668 #@+node:ekr.20191224093336.1: *4* function: match_parens
669 def match_parens(filename: str, i: int, j: int, tokens: List["Token"]) -> int:
670 """Match parens in tokens[i:j]. Return the new j."""
671 if j >= len(tokens):
672 return len(tokens)
673 # Calculate paren level...
674 level = 0
675 for n in range(i, j + 1):
676 token = tokens[n]
677 if token.kind == 'op' and token.value == '(':
678 level += 1
679 if token.kind == 'op' and token.value == ')':
680 if level == 0:
681 break
682 level -= 1
683 # Find matching ')' tokens *after* j.
684 if level > 0:
685 while level > 0 and j + 1 < len(tokens):
686 token = tokens[j + 1]
687 if token.kind == 'op' and token.value == ')':
688 level -= 1
689 elif token.kind == 'op' and token.value == '(':
690 level += 1
691 elif is_significant_token(token):
692 break
693 j += 1
694 if level != 0: # pragma: no cover.
695 line_n = tokens[i].line_number
696 raise AssignLinksError(
697 f"\n"
698 f"Unmatched parens: level={level}\n"
699 f" file: {filename}\n"
700 f" line: {line_n}\n")
701 return j
702 #@+node:ekr.20191223053324.1: *4* function: tokens_for_node
703 def tokens_for_node(filename: str, node: Node, global_token_list: List["Token"]) -> List["Token"]:
704 """Return the list of all tokens descending from node."""
705 # Find any token descending from node.
706 token = find_anchor_token(node, global_token_list)
707 if not token:
708 if 0: # A good trace for debugging.
709 print('')
710 g.trace('===== no tokens', node.__class__.__name__)
711 return []
712 assert is_ancestor(node, token)
713 # Scan backward.
714 i = first_i = token.index
715 while i >= 0:
716 token2 = global_token_list[i - 1]
717 if getattr(token2, 'node', None):
718 if is_ancestor(node, token2):
719 first_i = i - 1
720 else:
721 break
722 i -= 1
723 # Scan forward.
724 j = last_j = token.index
725 while j + 1 < len(global_token_list):
726 token2 = global_token_list[j + 1]
727 if getattr(token2, 'node', None):
728 if is_ancestor(node, token2):
729 last_j = j + 1
730 else:
731 break
732 j += 1
733 last_j = match_parens(filename, first_i, last_j, global_token_list)
734 results = global_token_list[first_i : last_j + 1]
735 return results
736 #@+node:ekr.20200101030236.1: *4* function: tokens_to_string
737 def tokens_to_string(tokens: List[Any]) -> str:
738 """Return the string represented by the list of tokens."""
739 if tokens is None:
740 # This indicates an internal error.
741 print('')
742 g.trace('===== token list is None ===== ')
743 print('')
744 return ''
745 return ''.join([z.to_string() for z in tokens])
746 #@+node:ekr.20191223095408.1: *3* node/token nodes...
747 # Functions that associate tokens with nodes.
748 #@+node:ekr.20200120082031.1: *4* function: find_statement_node
749 def find_statement_node(node: Node) -> Optional[Node]:
750 """
751 Return the nearest statement node.
752 Return None if node has only Module for a parent.
753 """
754 if isinstance(node, ast.Module):
755 return None
756 parent = node
757 while parent:
758 if is_statement_node(parent):
759 return parent
760 parent = parent.parent
761 return None
762 #@+node:ekr.20191223054300.1: *4* function: is_ancestor
763 def is_ancestor(node: Node, token: "Token") -> bool:
764 """Return True if node is an ancestor of token."""
765 t_node = token.node
766 if not t_node:
767 assert token.kind == 'killed', repr(token)
768 return False
769 while t_node:
770 if t_node == node:
771 return True
772 t_node = t_node.parent
773 return False
774 #@+node:ekr.20200120082300.1: *4* function: is_long_statement
775 def is_long_statement(node: Node) -> bool:
776 """
777 Return True if node is an instance of a node that might be split into
778 shorter lines.
779 """
780 return isinstance(node, (
781 ast.Assign, ast.AnnAssign, ast.AsyncFor, ast.AsyncWith, ast.AugAssign,
782 ast.Call, ast.Delete, ast.ExceptHandler, ast.For, ast.Global,
783 ast.If, ast.Import, ast.ImportFrom,
784 ast.Nonlocal, ast.Return, ast.While, ast.With, ast.Yield, ast.YieldFrom))
785 #@+node:ekr.20200120110005.1: *4* function: is_statement_node
786 def is_statement_node(node: Node) -> bool:
787 """Return True if node is a top-level statement."""
788 return is_long_statement(node) or isinstance(node, (
789 ast.Break, ast.Continue, ast.Pass, ast.Try))
790 #@+node:ekr.20191231082137.1: *4* function: nearest_common_ancestor
791 def nearest_common_ancestor(node1: Node, node2: Node) -> Optional[Node]:
792 """
793 Return the nearest common ancestor node for the given nodes.
795 The nodes must have parent links.
796 """
798 def parents(node: Node) -> List[Node]:
799 aList = []
800 while node:
801 aList.append(node)
802 node = node.parent
803 return list(reversed(aList))
805 result = None
806 parents1 = parents(node1)
807 parents2 = parents(node2)
808 while parents1 and parents2:
809 parent1 = parents1.pop(0)
810 parent2 = parents2.pop(0)
811 if parent1 == parent2:
812 result = parent1
813 else:
814 break
815 return result
816 #@+node:ekr.20191231072039.1: *3* functions: utils...
817 # General utility functions on tokens and nodes.
818 #@+node:ekr.20191119085222.1: *4* function: obj_id
819 def obj_id(obj: Any) -> str:
820 """Return the last four digits of id(obj), for dumps & traces."""
821 return str(id(obj))[-4:]
822 #@+node:ekr.20191231060700.1: *4* function: op_name
823 #@@nobeautify
825 # https://docs.python.org/3/library/ast.html
827 _op_names = {
828 # Binary operators.
829 'Add': '+',
830 'BitAnd': '&',
831 'BitOr': '|',
832 'BitXor': '^',
833 'Div': '/',
834 'FloorDiv': '//',
835 'LShift': '<<',
836 'MatMult': '@', # Python 3.5.
837 'Mod': '%',
838 'Mult': '*',
839 'Pow': '**',
840 'RShift': '>>',
841 'Sub': '-',
842 # Boolean operators.
843 'And': ' and ',
844 'Or': ' or ',
845 # Comparison operators
846 'Eq': '==',
847 'Gt': '>',
848 'GtE': '>=',
849 'In': ' in ',
850 'Is': ' is ',
851 'IsNot': ' is not ',
852 'Lt': '<',
853 'LtE': '<=',
854 'NotEq': '!=',
855 'NotIn': ' not in ',
856 # Context operators.
857 'AugLoad': '<AugLoad>',
858 'AugStore': '<AugStore>',
859 'Del': '<Del>',
860 'Load': '<Load>',
861 'Param': '<Param>',
862 'Store': '<Store>',
863 # Unary operators.
864 'Invert': '~',
865 'Not': ' not ',
866 'UAdd': '+',
867 'USub': '-',
868 }
870 def op_name(node: Node) -> str:
871 """Return the print name of an operator node."""
872 class_name = node.__class__.__name__
873 assert class_name in _op_names, repr(class_name)
874 return _op_names[class_name].strip()
875 #@+node:ekr.20200107114452.1: *3* node/token creators...
876 #@+node:ekr.20200103082049.1: *4* function: make_tokens
877 def make_tokens(contents: str) -> List["Token"]:
878 """
879 Return a list (not a generator) of Token objects corresponding to the
880 list of 5-tuples generated by tokenize.tokenize.
882 Perform consistency checks and handle all exceptions.
883 """
885 def check(contents: str, tokens: List["Token"]) -> bool:
886 result = tokens_to_string(tokens)
887 ok = result == contents
888 if not ok:
889 print('\nRound-trip check FAILS')
890 print('Contents...\n')
891 g.printObj(contents)
892 print('\nResult...\n')
893 g.printObj(result)
894 return ok
896 try:
897 five_tuples = tokenize.tokenize(
898 io.BytesIO(contents.encode('utf-8')).readline)
899 except Exception:
900 print('make_tokens: exception in tokenize.tokenize')
901 g.es_exception()
902 return None
903 tokens = Tokenizer().create_input_tokens(contents, five_tuples)
904 assert check(contents, tokens)
905 return tokens
906 #@+node:ekr.20191027075648.1: *4* function: parse_ast
907 def parse_ast(s: str) -> Optional[Node]:
908 """
909 Parse string s, catching & reporting all exceptions.
910 Return the ast node, or None.
911 """
913 def oops(message: str) -> None:
914 print('')
915 print(f"parse_ast: {message}")
916 g.printObj(s)
917 print('')
919 try:
920 s1 = g.toEncodedString(s)
921 tree = ast.parse(s1, filename='before', mode='exec')
922 return tree
923 except IndentationError:
924 oops('Indentation Error')
925 except SyntaxError:
926 oops('Syntax Error')
927 except Exception:
928 oops('Unexpected Exception')
929 g.es_exception()
930 return None
931 #@+node:ekr.20191231110051.1: *3* node/token dumpers...
932 #@+node:ekr.20191027074436.1: *4* function: dump_ast
933 def dump_ast(ast: Node, tag: str='dump_ast') -> None:
934 """Utility to dump an ast tree."""
935 g.printObj(AstDumper().dump_ast(ast), tag=tag)
936 #@+node:ekr.20191228095945.4: *4* function: dump_contents
937 def dump_contents(contents: str, tag: str='Contents') -> None:
938 print('')
939 print(f"{tag}...\n")
940 for i, z in enumerate(g.splitLines(contents)):
941 print(f"{i+1:<3} ", z.rstrip())
942 print('')
943 #@+node:ekr.20191228095945.5: *4* function: dump_lines
944 def dump_lines(tokens: List["Token"], tag: str='Token lines') -> None:
945 print('')
946 print(f"{tag}...\n")
947 for z in tokens:
948 if z.line.strip():
949 print(z.line.rstrip())
950 else:
951 print(repr(z.line))
952 print('')
953 #@+node:ekr.20191228095945.7: *4* function: dump_results
954 def dump_results(tokens: List["Token"], tag: str='Results') -> None:
955 print('')
956 print(f"{tag}...\n")
957 print(tokens_to_string(tokens))
958 print('')
959 #@+node:ekr.20191228095945.8: *4* function: dump_tokens
960 def dump_tokens(tokens: List["Token"], tag: str='Tokens') -> None:
961 print('')
962 print(f"{tag}...\n")
963 if not tokens:
964 return
965 print("Note: values shown are repr(value) *except* for 'string' tokens.")
966 tokens[0].dump_header()
967 for i, z in enumerate(tokens):
968 # Confusing.
969 # if (i % 20) == 0: z.dump_header()
970 print(z.dump())
971 print('')
972 #@+node:ekr.20191228095945.9: *4* function: dump_tree
973 def dump_tree(tokens: List["Token"], tree: Node, tag: str='Tree') -> None:
974 print('')
975 print(f"{tag}...\n")
976 print(AstDumper().dump_tree(tokens, tree))
977 #@+node:ekr.20200107040729.1: *4* function: show_diffs
978 def show_diffs(s1: str, s2: str, filename: str='') -> None:
979 """Print diffs between strings s1 and s2."""
980 lines = list(difflib.unified_diff(
981 g.splitLines(s1),
982 g.splitLines(s2),
983 fromfile=f"Old {filename}",
984 tofile=f"New {filename}",
985 ))
986 print('')
987 tag = f"Diffs for {filename}" if filename else 'Diffs'
988 g.printObj(lines, tag=tag)
989 #@+node:ekr.20191225061516.1: *3* node/token replacers...
990 # Functions that replace tokens or nodes.
991 #@+node:ekr.20191231162249.1: *4* function: add_token_to_token_list
992 def add_token_to_token_list(token: "Token", node: Node) -> None:
993 """Insert token in the proper location of node.token_list."""
994 if getattr(node, 'first_i', None) is None:
995 node.first_i = node.last_i = token.index
996 else:
997 node.first_i = min(node.first_i, token.index)
998 node.last_i = max(node.last_i, token.index)
999 #@+node:ekr.20191225055616.1: *4* function: replace_node
1000 def replace_node(new_node: Node, old_node: Node) -> None:
1001 """Replace new_node by old_node in the parse tree."""
1002 parent = old_node.parent
1003 new_node.parent = parent
1004 new_node.node_index = old_node.node_index
1005 children = parent.children
1006 i = children.index(old_node)
1007 children[i] = new_node
1008 fields = getattr(old_node, '_fields', None)
1009 if fields:
1010 for field in fields:
1011 field = getattr(old_node, field)
1012 if field == old_node:
1013 setattr(old_node, field, new_node)
1014 break
1015 #@+node:ekr.20191225055626.1: *4* function: replace_token
1016 def replace_token(token: "Token", kind: str, value: str) -> None:
1017 """Replace kind and value of the given token."""
1018 if token.kind in ('endmarker', 'killed'):
1019 return
1020 token.kind = kind
1021 token.value = value
1022 token.node = None # Should be filled later.
1023 #@-others
1024#@+node:ekr.20191027072910.1: ** Exception classes
1025class AssignLinksError(Exception):
1026 """Assigning links to ast nodes failed."""
1029class AstNotEqual(Exception):
1030 """The two given AST's are not equivalent."""
1032class BeautifyError(Exception):
1033 """Leading tabs found."""
1036class FailFast(Exception):
1037 """Abort tests in TestRunner class."""
1038#@+node:ekr.20220402062255.1: ** Classes
1039#@+node:ekr.20141012064706.18390: *3* class AstDumper
1040class AstDumper: # pragma: no cover
1041 """A class supporting various kinds of dumps of ast nodes."""
1042 #@+others
1043 #@+node:ekr.20191112033445.1: *4* dumper.dump_tree & helper
1044 def dump_tree(self, tokens: List["Token"], tree: Node) -> str:
1045 """Briefly show a tree, properly indented."""
1046 self.tokens = tokens
1047 result = [self.show_header()]
1048 self.dump_tree_and_links_helper(tree, 0, result)
1049 return ''.join(result)
1050 #@+node:ekr.20191125035321.1: *5* dumper.dump_tree_and_links_helper
1051 def dump_tree_and_links_helper(self, node: Node, level: int, result: List[str]) -> None:
1052 """Return the list of lines in result."""
1053 if node is None:
1054 return
1055 # Let block.
1056 indent = ' ' * 2 * level
1057 children: List[ast.AST] = getattr(node, 'children', [])
1058 node_s = self.compute_node_string(node, level)
1059 # Dump...
1060 if isinstance(node, (list, tuple)):
1061 for z in node:
1062 self.dump_tree_and_links_helper(z, level, result)
1063 elif isinstance(node, str):
1064 result.append(f"{indent}{node.__class__.__name__:>8}:{node}\n")
1065 elif isinstance(node, ast.AST):
1066 # Node and parent.
1067 result.append(node_s)
1068 # Children.
1069 for z in children:
1070 self.dump_tree_and_links_helper(z, level + 1, result)
1071 else:
1072 result.append(node_s)
1073 #@+node:ekr.20191125035600.1: *4* dumper.compute_node_string & helpers
1074 def compute_node_string(self, node: Node, level: int) -> str:
1075 """Return a string summarizing the node."""
1076 indent = ' ' * 2 * level
1077 parent = getattr(node, 'parent', None)
1078 node_id = getattr(node, 'node_index', '??')
1079 parent_id = getattr(parent, 'node_index', '??')
1080 parent_s = f"{parent_id:>3}.{parent.__class__.__name__} " if parent else ''
1081 class_name = node.__class__.__name__
1082 descriptor_s = f"{node_id}.{class_name}: " + self.show_fields(
1083 class_name, node, 30)
1084 tokens_s = self.show_tokens(node, 70, 100)
1085 lines = self.show_line_range(node)
1086 full_s1 = f"{parent_s:<16} {lines:<10} {indent}{descriptor_s} "
1087 node_s = f"{full_s1:<62} {tokens_s}\n"
1088 return node_s
1089 #@+node:ekr.20191113223424.1: *5* dumper.show_fields
1090 def show_fields(self, class_name: str, node: Node, truncate_n: int) -> str:
1091 """Return a string showing interesting fields of the node."""
1092 val = ''
1093 if class_name == 'JoinedStr':
1094 values = node.values
1095 assert isinstance(values, list)
1096 # Str tokens may represent *concatenated* strings.
1097 results = []
1098 fstrings, strings = 0, 0
1099 for z in values:
1100 assert isinstance(z, (ast.FormattedValue, ast.Str))
1101 if isinstance(z, ast.Str):
1102 results.append(z.s)
1103 strings += 1
1104 else:
1105 results.append(z.__class__.__name__)
1106 fstrings += 1
1107 val = f"{strings} str, {fstrings} f-str"
1108 elif class_name == 'keyword':
1109 if isinstance(node.value, ast.Str):
1110 val = f"arg={node.arg}..Str.value.s={node.value.s}"
1111 elif isinstance(node.value, ast.Name):
1112 val = f"arg={node.arg}..Name.value.id={node.value.id}"
1113 else:
1114 val = f"arg={node.arg}..value={node.value.__class__.__name__}"
1115 elif class_name == 'Name':
1116 val = f"id={node.id!r}"
1117 elif class_name == 'NameConstant':
1118 val = f"value={node.value!r}"
1119 elif class_name == 'Num':
1120 val = f"n={node.n}"
1121 elif class_name == 'Starred':
1122 if isinstance(node.value, ast.Str):
1123 val = f"s={node.value.s}"
1124 elif isinstance(node.value, ast.Name):
1125 val = f"id={node.value.id}"
1126 else:
1127 val = f"s={node.value.__class__.__name__}"
1128 elif class_name == 'Str':
1129 val = f"s={node.s!r}"
1130 elif class_name in ('AugAssign', 'BinOp', 'BoolOp', 'UnaryOp'): # IfExp
1131 name = node.op.__class__.__name__
1132 val = f"op={_op_names.get(name, name)}"
1133 elif class_name == 'Compare':
1134 ops = ','.join([op_name(z) for z in node.ops])
1135 val = f"ops='{ops}'"
1136 else:
1137 val = ''
1138 return g.truncate(val, truncate_n)
1139 #@+node:ekr.20191114054726.1: *5* dumper.show_line_range
1140 def show_line_range(self, node: Node) -> str:
1142 token_list = get_node_token_list(node, self.tokens)
1143 if not token_list:
1144 return ''
1145 min_ = min([z.line_number for z in token_list])
1146 max_ = max([z.line_number for z in token_list])
1147 return f"{min_}" if min_ == max_ else f"{min_}..{max_}"
1148 #@+node:ekr.20191113223425.1: *5* dumper.show_tokens
1149 def show_tokens(self, node: Node, n: int, m: int, show_cruft: bool=False) -> str:
1150 """
1151 Return a string showing node.token_list.
1153 Split the result if n + len(result) > m
1154 """
1155 token_list = get_node_token_list(node, self.tokens)
1156 result = []
1157 for z in token_list:
1158 val = None
1159 if z.kind == 'comment':
1160 if show_cruft:
1161 val = g.truncate(z.value, 10) # Short is good.
1162 result.append(f"{z.kind}.{z.index}({val})")
1163 elif z.kind == 'name':
1164 val = g.truncate(z.value, 20)
1165 result.append(f"{z.kind}.{z.index}({val})")
1166 elif z.kind == 'newline':
1167 # result.append(f"{z.kind}.{z.index}({z.line_number}:{len(z.line)})")
1168 result.append(f"{z.kind}.{z.index}")
1169 elif z.kind == 'number':
1170 result.append(f"{z.kind}.{z.index}({z.value})")
1171 elif z.kind == 'op':
1172 if z.value not in ',()' or show_cruft:
1173 result.append(f"{z.kind}.{z.index}({z.value})")
1174 elif z.kind == 'string':
1175 val = g.truncate(z.value, 30)
1176 result.append(f"{z.kind}.{z.index}({val})")
1177 elif z.kind == 'ws':
1178 if show_cruft:
1179 result.append(f"{z.kind}.{z.index}({len(z.value)})")
1180 else:
1181 # Indent, dedent, encoding, etc.
1182 # Don't put a blank.
1183 continue
1184 if result and result[-1] != ' ':
1185 result.append(' ')
1186 #
1187 # split the line if it is too long.
1188 # g.printObj(result, tag='show_tokens')
1189 if 1:
1190 return ''.join(result)
1191 line, lines = [], []
1192 for r in result:
1193 line.append(r)
1194 if n + len(''.join(line)) >= m:
1195 lines.append(''.join(line))
1196 line = []
1197 lines.append(''.join(line))
1198 pad = '\n' + ' ' * n
1199 return pad.join(lines)
1200 #@+node:ekr.20191110165235.5: *4* dumper.show_header
1201 def show_header(self) -> str:
1202 """Return a header string, but only the first time."""
1203 return (
1204 f"{'parent':<16} {'lines':<10} {'node':<34} {'tokens'}\n"
1205 f"{'======':<16} {'=====':<10} {'====':<34} {'======'}\n")
1206 #@+node:ekr.20141012064706.18392: *4* dumper.dump_ast & helper
1207 annotate_fields = False
1208 include_attributes = False
1209 indent_ws = ' '
1211 def dump_ast(self, node: Node, level: int=0) -> str:
1212 """
1213 Dump an ast tree. Adapted from ast.dump.
1214 """
1215 sep1 = '\n%s' % (self.indent_ws * (level + 1))
1216 if isinstance(node, ast.AST):
1217 fields = [(a, self.dump_ast(b, level + 1)) for a, b in self.get_fields(node)]
1218 if self.include_attributes and node._attributes:
1219 fields.extend([(a, self.dump_ast(getattr(node, a), level + 1))
1220 for a in node._attributes])
1221 if self.annotate_fields:
1222 aList = ['%s=%s' % (a, b) for a, b in fields]
1223 else:
1224 aList = [b for a, b in fields]
1225 name = node.__class__.__name__
1226 sep = '' if len(aList) <= 1 else sep1
1227 return '%s(%s%s)' % (name, sep, sep1.join(aList))
1228 if isinstance(node, list):
1229 sep = sep1
1230 return 'LIST[%s]' % ''.join(
1231 ['%s%s' % (sep, self.dump_ast(z, level + 1)) for z in node])
1232 return repr(node)
1233 #@+node:ekr.20141012064706.18393: *5* dumper.get_fields
1234 def get_fields(self, node: Node) -> Generator:
1236 return (
1237 (a, b) for a, b in ast.iter_fields(node)
1238 if a not in ['ctx',] and b not in (None, [])
1239 )
1240 #@-others
1241#@+node:ekr.20191222083453.1: *3* class Fstringify
1242class Fstringify:
1243 """A class to fstringify files."""
1245 silent = True # for pytest. Defined in all entries.
1246 line_number = 0
1247 line = ''
1249 #@+others
1250 #@+node:ekr.20191222083947.1: *4* fs.fstringify
1251 def fstringify(self, contents: str, filename: str, tokens: List["Token"], tree: Node) -> str:
1252 """
1253 Fstringify.fstringify:
1255 f-stringify the sources given by (tokens, tree).
1257 Return the resulting string.
1258 """
1259 self.filename = filename
1260 self.tokens = tokens
1261 self.tree = tree
1262 # Prepass: reassign tokens.
1263 ReassignTokens().reassign(filename, tokens, tree)
1264 # Main pass.
1265 for node in ast.walk(tree):
1266 if (
1267 isinstance(node, ast.BinOp)
1268 and op_name(node.op) == '%'
1269 and isinstance(node.left, ast.Str)
1270 ):
1271 self.make_fstring(node)
1272 results = tokens_to_string(self.tokens)
1273 return results
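# For example, the main pass above rewrites '%' formatting such as
#     s = "%s = %s" % (lhs, rhs)
# into the equivalent f-string
#     s = f"{lhs} = {rhs}"
# (a sketch of the intended transformation; the variable names are hypothetical).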
1274 #@+node:ekr.20200103054101.1: *4* fs.fstringify_file (entry)
1275 def fstringify_file(self, filename: str) -> bool: # pragma: no cover
1276 """
1277 Fstringify.fstringify_file.
1279 The entry point for the fstringify-file command.
1281 f-stringify the given external file with the Fstringify class.
1283 Return True if the file was changed.
1284 """
1285 tag = 'fstringify-file'
1286 self.filename = filename
1287 self.silent = False
1288 tog = TokenOrderGenerator()
1289 try:
1290 contents, encoding, tokens, tree = tog.init_from_file(filename)
1291 if not contents or not tokens or not tree:
1292 print(f"{tag}: Can not fstringify: {filename}")
1293 return False
1294 results = self.fstringify(contents, filename, tokens, tree)
1295 except Exception as e:
1296 print(e)
1297 return False
1298 # Something besides newlines must change.
1299 changed = regularize_nls(contents) != regularize_nls(results)
1300 status = 'Wrote' if changed else 'Unchanged'
1301 print(f"{tag}: {status:>9}: {filename}")
1302 if changed:
1303 write_file(filename, results, encoding=encoding)
1304 return changed
1305 #@+node:ekr.20200103065728.1: *4* fs.fstringify_file_diff (entry)
1306 def fstringify_file_diff(self, filename: str) -> bool: # pragma: no cover
1307 """
1308 Fstringify.fstringify_file_diff.
1310 The entry point for the diff-fstringify-file command.
1312 Print the diffs that would result from the fstringify-file command.
1314 Return True if the file would be changed.
1315 """
1316 tag = 'diff-fstringify-file'
1317 self.filename = filename
1318 self.silent = False
1319 tog = TokenOrderGenerator()
1320 try:
1321 contents, encoding, tokens, tree = tog.init_from_file(filename)
1322 if not contents or not tokens or not tree:
1323 return False
1324 results = self.fstringify(contents, filename, tokens, tree)
1325 except Exception as e:
1326 print(e)
1327 return False
1328 # Something besides newlines must change.
1329 changed = regularize_nls(contents) != regularize_nls(results)
1330 if changed:
1331 show_diffs(contents, results, filename=filename)
1332 else:
1333 print(f"{tag}: Unchanged: {filename}")
1334 return changed
1335 #@+node:ekr.20200112060218.1: *4* fs.fstringify_file_silent (entry)
1336 def fstringify_file_silent(self, filename: str) -> bool: # pragma: no cover
1337 """
1338 Fstringify.fstringify_file_silent.
1340 The entry point for the silent-fstringify-file command.
1342 fstringify the given file, suppressing all but serious error messages.
1344 Return True if the file would be changed.
1345 """
1346 self.filename = filename
1347 self.silent = True
1348 tog = TokenOrderGenerator()
1349 try:
1350 contents, encoding, tokens, tree = tog.init_from_file(filename)
1351 if not contents or not tokens or not tree:
1352 return False
1353 results = self.fstringify(contents, filename, tokens, tree)
1354 except Exception as e:
1355 print(e)
1356 return False
1357 # Something besides newlines must change.
1358 changed = regularize_nls(contents) != regularize_nls(results)
1359 status = 'Wrote' if changed else 'Unchanged'
1360 # Write the results.
1361 print(f"{status:>9}: {filename}")
1362 if changed:
1363 write_file(filename, results, encoding=encoding)
1364 return changed
1365 #@+node:ekr.20191222095754.1: *4* fs.make_fstring & helpers
1366 def make_fstring(self, node: Node) -> None:
1367 """
1368 node is BinOp node representing an '%' operator.
1369 node.left is an ast.Str node.
1370 node.right represents the RHS of the '%' operator.
1372 Convert this tree to an f-string, if possible.
1373 Replace the node's entire tree with a new ast.Str node.
1374 Replace all the relevant tokens with a single new 'string' token.
1375 """
1376 trace = False
1377 assert isinstance(node.left, ast.Str), (repr(node.left), g.callers())
1378 # Careful: use the tokens, not Str.s. This preserves spelling.
1379 lt_token_list = get_node_token_list(node.left, self.tokens)
1380 if not lt_token_list: # pragma: no cover
1381 print('')
1382 g.trace('Error: no token list in Str')
1383 dump_tree(self.tokens, node)
1384 print('')
1385 return
1386 lt_s = tokens_to_string(lt_token_list)
1387 if trace:
1388 g.trace('lt_s:', lt_s) # pragma: no cover
1389 # Get the RHS values, a list of token lists.
1390 values = self.scan_rhs(node.right)
1391 if trace: # pragma: no cover
1392 for i, z in enumerate(values):
1393 dump_tokens(z, tag=f"RHS value {i}")
1394 # Compute rt_s, self.line and self.line_number for later messages.
1395 token0 = lt_token_list[0]
1396 self.line_number = token0.line_number
1397 self.line = token0.line.strip()
1398 rt_s = ''.join(tokens_to_string(z) for z in values)
1399 # Get the % specs in the LHS string.
1400 specs = self.scan_format_string(lt_s)
1401 if len(values) != len(specs): # pragma: no cover
1402 self.message(
1403 f"can't create f-string: {lt_s!r}\n"
1404 f":f-string mismatch: "
1405 f"{len(values)} value{g.plural(len(values))}, "
1406 f"{len(specs)} spec{g.plural(len(specs))}")
1407 return
1408 # Replace specs with values.
1409 results = self.substitute_values(lt_s, specs, values)
1410 result = self.compute_result(lt_s, results)
1411 if not result:
1412 return
1413 # Remove whitespace before ! and :.
1414 result = self.clean_ws(result)
1415 # Show the results
1416 if trace: # pragma: no cover
1417 before = (lt_s + ' % ' + rt_s).replace('\n', '<NL>')
1418 after = result.replace('\n', '<NL>')
1419 self.message(
1420 f"trace:\n"
1421 f":from: {before!s}\n"
1422 f": to: {after!s}")
1423 # Adjust the tree and the token list.
1424 self.replace(node, result, values)
1425 #@+node:ekr.20191222102831.3: *5* fs.clean_ws
1426 ws_pat = re.compile(r'(\s+)([:!][0-9]\})')
1428 def clean_ws(self, s: str) -> str:
1429 """Carefully remove whitespace before ! and : specifiers."""
1430 s = re.sub(self.ws_pat, r'\2', s)
1431 return s
1432 #@+node:ekr.20191222102831.4: *5* fs.compute_result & helpers
1433 def compute_result(self, lt_s: str, tokens: List["Token"]) -> str:
1434 """
1435 Create the final result, with various kinds of munges.
1437 Return the result string, or None if there are errors.
1438 """
1439 # Fail if there is a backslash within { and }.
1440 if not self.check_back_slashes(lt_s, tokens):
1441 return None # pragma: no cover
1442 # Ensure consistent quotes.
1443 if not self.change_quotes(lt_s, tokens):
1444 return None # pragma: no cover
1445 return tokens_to_string(tokens)
1446 #@+node:ekr.20200215074309.1: *6* fs.check_back_slashes
1447 def check_back_slashes(self, lt_s: str, tokens: List["Token"]) -> bool:
1448 """
1449 Return False if any backslash appears within an {} expression.
1451 Tokens is a list of tokens on the RHS.
1452 """
1453 count = 0
1454 for z in tokens:
1455 if z.kind == 'op':
1456 if z.value == '{':
1457 count += 1
1458 elif z.value == '}':
1459 count -= 1
1460 if (count % 2) == 1 and '\\' in z.value:
1461 if not self.silent:
1462 self.message( # pragma: no cover (silent during unit tests)
1463 f"can't create f-string: {lt_s!r}\n"
1464 f":backslash in {{expr}}:")
1465 return False
1466 return True
1467 #@+node:ekr.20191222102831.7: *6* fs.change_quotes
1468 def change_quotes(self, lt_s: str, aList: List[Any]) -> bool:
1469 """
1470 Carefully check quotes in all "inner" tokens as necessary.
1472 Return False if the f-string would contain backslashes.
1474 We expect the following "outer" tokens.
1476 aList[0]: ('string', 'f')
1477 aList[1]: ('string', a single or double quote.
1478 aList[-1]: ('string', a single or double quote matching aList[1])
1479 """
1480 # Sanity checks.
1481 if len(aList) < 4:
1482 return True # pragma: no cover (defensive)
1483 if not lt_s: # pragma: no cover (defensive)
1484 self.message("can't create f-string: no lt_s!")
1485 return False
1486 delim = lt_s[0]
1487 # Check tokens 0, 1 and -1.
1488 token0 = aList[0]
1489 token1 = aList[1]
1490 token_last = aList[-1]
1491 for token in token0, token1, token_last:
1492 # These are the only kinds of tokens we expect to generate.
1493 ok = (
1494 token.kind == 'string' or
1495 token.kind == 'op' and token.value in '{}')
1496 if not ok: # pragma: no cover (defensive)
1497 self.message(
1498 f"unexpected token: {token.kind} {token.value}\n"
1499 f": lt_s: {lt_s!r}")
1500 return False
1501 # These checks are important...
1502 if token0.value != 'f':
1503 return False # pragma: no cover (defensive)
1504 val1 = token1.value
1505 if delim != val1:
1506 return False # pragma: no cover (defensive)
1507 val_last = token_last.value
1508 if delim != val_last:
1509 return False # pragma: no cover (defensive)
1510 #
1511 # Check for conflicting delims, preferring f"..." to f'...'.
1512 for delim in ('"', "'"):
1513 aList[1] = aList[-1] = Token('string', delim)
1514 for z in aList[2:-1]:
1515 if delim in z.value:
1516 break
1517 else:
1518 return True
1519 if not self.silent: # pragma: no cover (silent unit test)
1520 self.message(
1521 f"can't create f-string: {lt_s!r}\n"
1522 f": conflicting delims:")
1523 return False
1524 #@+node:ekr.20191222102831.6: *5* fs.munge_spec
1525 def munge_spec(self, spec: str) -> Tuple[str, str]:
1526 """
1527 Return (head, tail).
1529 The format of the resulting spec is !head:tail or :tail.
1531 Example specs: s2, r3
1532 """
1533 # To do: handle more specs.
1534 head, tail = [], []
1535 if spec.startswith('+'):
1536 pass # Leave it alone!
1537 elif spec.startswith('-'):
1538 tail.append('>')
1539 spec = spec[1:]
1540 if spec.endswith('s'):
1541 spec = spec[:-1]
1542 if spec.endswith('r'):
1543 head.append('r')
1544 spec = spec[:-1]
1545 tail_s = ''.join(tail) + spec
1546 head_s = ''.join(head)
1547 return head_s, tail_s
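# Worked examples (a sketch tracing the code above, not an exhaustive spec):
#   munge_spec('s') returns ('', ''),  so '%s' % x becomes f'{x}'
#   munge_spec('r') returns ('r', ''), so '%r' % x becomes f'{x!r}'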
1548 #@+node:ekr.20191222102831.9: *5* fs.scan_format_string
1549 # format_spec ::= [[fill]align][sign][#][0][width][,][.precision][type]
1550 # fill ::= <any character>
1551 # align ::= "<" | ">" | "=" | "^"
1552 # sign ::= "+" | "-" | " "
1553 # width ::= integer
1554 # precision ::= integer
1555 # type ::= "b" | "c" | "d" | "e" | "E" | "f" | "F" | "g" | "G" | "n" | "o" | "s" | "x" | "X" | "%"
1557 format_pat = re.compile(r'%(([+-]?[0-9]*(\.)?[0-9]*)*[bcdeEfFgGnoxrsX]?)')
1559 def scan_format_string(self, s: str) -> List[re.Match]:
1560 """Scan the format string s, returning a list of match objects."""
1561 result = list(re.finditer(self.format_pat, s))
1562 return result
1563 #@+node:ekr.20191222104224.1: *5* fs.scan_rhs
1564 def scan_rhs(self, node: Node) -> List[Any]:
1565 """
1566 Scan the right-hand side of a potential f-string.
1568 Return a list of the token lists for each element.
1569 """
1570 trace = False
1571 # First, Try the most common cases.
1572 if isinstance(node, ast.Str):
1573 token_list = get_node_token_list(node, self.tokens)
1574 return [token_list]
1575 if isinstance(node, (list, tuple, ast.Tuple)):
1576 result = []
1577 elts = node.elts if isinstance(node, ast.Tuple) else node
1578 for i, elt in enumerate(elts):
1579 tokens = tokens_for_node(self.filename, elt, self.tokens)
1580 result.append(tokens)
1581 if trace: # pragma: no cover
1582 g.trace(f"item: {i}: {elt.__class__.__name__}")
1583 g.printObj(tokens, tag=f"Tokens for item {i}")
1584 return result
1585 # Now we expect only one result.
1586 tokens = tokens_for_node(self.filename, node, self.tokens)
1587 return [tokens]
1588 #@+node:ekr.20191226155316.1: *5* fs.substitute_values
1589 def substitute_values(self, lt_s: str, specs: List[re.Match], values: List) -> List["Token"]:
1590 """
1591 Replace specifiers with values in lt_s string.
1593 Double { and } as needed.
1594 """
1595 i, results = 0, [Token('string', 'f')]
1596 for spec_i, m in enumerate(specs):
1597 value = tokens_to_string(values[spec_i])
1598 start, end, spec = m.start(0), m.end(0), m.group(1)
1599 if start > i:
1600 val = lt_s[i:start].replace('{', '{{').replace('}', '}}')
1601 results.append(Token('string', val[0]))
1602 results.append(Token('string', val[1:]))
1603 head, tail = self.munge_spec(spec)
1604 results.append(Token('op', '{'))
1605 results.append(Token('string', value))
1606 if head:
1607 results.append(Token('string', '!'))
1608 results.append(Token('string', head))
1609 if tail:
1610 results.append(Token('string', ':'))
1611 results.append(Token('string', tail))
1612 results.append(Token('op', '}'))
1613 i = end
1614 # Add the tail.
1615 tail = lt_s[i:]
1616 if tail:
1617 tail = tail.replace('{', '{{').replace('}', '}}')
1618 results.append(Token('string', tail[:-1]))
1619 results.append(Token('string', tail[-1]))
1620 return results
1621 #@+node:ekr.20200214142019.1: *4* fs.message
1622 def message(self, message: str) -> None: # pragma: no cover.
1623 """
1624 Print one or more message lines aligned on the first colon of the message.
1625 """
1626 # Print a leading blank line.
1627 print('')
1628 # Calculate the padding.
1629 lines = g.splitLines(message)
1630 pad = max(lines[0].find(':'), 30)
1631 # Print the first line.
1632 z = lines[0]
1633 i = z.find(':')
1634 if i == -1:
1635 print(z.rstrip())
1636 else:
1637 print(f"{z[:i+2].strip():>{pad+1}} {z[i+2:].strip()}")
1638 # Print the remaining message lines.
1639 for z in lines[1:]:
1640 if z.startswith('<'):
1641 # Print left aligned.
1642 print(z[1:].strip())
1643 elif z.startswith(':') and -1 < z[1:].find(':') <= pad:
1644 # Align with the first line.
1645 i = z[1:].find(':')
1646 print(f"{z[1:i+2].strip():>{pad+1}} {z[i+2:].strip()}")
1647 elif z.startswith('>'):
1648 # Align after the aligning colon.
1649 print(f"{' ':>{pad+2}}{z[1:].strip()}")
1650 else:
1651 # Default: Put the entire line after the aligning colon.
1652 print(f"{' ':>{pad+2}}{z.strip()}")
1653 # Print the standard message lines.
1654 file_s = f"{'file':>{pad}}"
1655 ln_n_s = f"{'line number':>{pad}}"
1656 line_s = f"{'line':>{pad}}"
1657 print(
1658 f"{file_s}: {self.filename}\n"
1659 f"{ln_n_s}: {self.line_number}\n"
1660 f"{line_s}: {self.line!r}")
1661 #@+node:ekr.20191225054848.1: *4* fs.replace
1662 def replace(self, node: Node, s: str, values: List["Token"]) -> None:
1663 """
1664 Replace node with an ast.Str node for s.
1665 Replace all tokens in the range of values with a single 'string' node.
1666 """
1667 # Replace the tokens...
1668 tokens = tokens_for_node(self.filename, node, self.tokens)
1669 i1 = i = tokens[0].index
1670 replace_token(self.tokens[i], 'string', s)
1671 j = 1
1672 while j < len(tokens):
1673 replace_token(self.tokens[i1 + j], 'killed', '')
1674 j += 1
1675 # Replace the node.
1676 new_node = ast.Str()
1677 new_node.s = s
1678 replace_node(new_node, node)
1679 # Update the token.
1680 token = self.tokens[i1]
1681 token.node = new_node
1682 # Update the token list.
1683 add_token_to_token_list(token, new_node)
1684 #@-others
1685#@+node:ekr.20220330191947.1: *3* class IterativeTokenGenerator
1686class IterativeTokenGenerator:
1687 """
1688 Self-contained iterative token syncing class. It shows how to traverse
1689 any tree with neither recursion nor iterators.
1691 This class is almost exactly as fast as the TokenOrderGenerator class.
1693 This class is another curio: Leo does not use this code.
1695 The main_loop method executes **actions**: (method, argument) tuples.
1697 The key idea: visitors (and visit) never execute code directly.
1698 Instead, they queue methods to be executed in the main loop.
1700 *Important*: find_next_significant_token must be called only *after*
1701 actions have eaten all previous tokens. So do_If (and other visitors)
1702 must queue up **helper actions** for later (delayed) execution.
1703 """
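# A concrete sketch of the action idea: for the statement 'if x: pass',
# do_If (below) queues roughly these (method, argument) tuples, which
# main_loop then pops and executes in order:
#
#     (self.name, 'if'), (self.visit, node.test),
#     (self.op, ':'), (self.visit, node.body),
#
# and each (self.visit, child) expands to (enter_node, child),
# (do_<ClassName>, child), (leave_node, child).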
1705 begin_end_stack: List[str] = [] # A stack of node names.
1706 n_nodes = 0 # The number of nodes that have been visited.
1707 node = None # The current node.
1708 node_index = 0 # The index into the node_stack.
1709 node_stack: List[ast.AST] = [] # The stack of parent nodes.
1711 #@+others
1712 #@+node:ekr.20220402095550.1: *4* iterative: Init...
1713 # Same as in the TokenOrderGenerator class.
1714 #@+node:ekr.20220402095550.2: *5* iterative.balance_tokens
1715 def balance_tokens(self, tokens: List["Token"]) -> int:
1716 """
1717 TOG.balance_tokens.
1719 Insert two-way links between matching paren tokens.
1720 """
1721 count, stack = 0, []
1722 for token in tokens:
1723 if token.kind == 'op':
1724 if token.value == '(':
1725 count += 1
1726 stack.append(token.index)
1727 if token.value == ')':
1728 if stack:
1729 index = stack.pop()
1730 tokens[index].matching_paren = token.index
1731 tokens[token.index].matching_paren = index
1732 else: # pragma: no cover
1733 g.trace(f"unmatched ')' at index {token.index}")
1734 if stack: # pragma: no cover
1735 g.trace(f"unmatched '(' at {','.join(map(str, stack))}")
1736 return count
1737 #@+node:ekr.20220402095550.3: *5* iterative.create_links (changed)
1738 def create_links(self, tokens: List["Token"], tree: Node, file_name: str='') -> List:
1739 """
1740 Create two-way links between the given tokens and the ast tree.
1742 Unlike TOG.create_links, this method is not a generator; it returns [] so that legacy callers may still write list(tog.create_links(...)).
1744 The sync_token calls create the links and verify that the resulting
1745 tree traversal generates exactly the given tokens, in order.
1747 tokens: the list of Token instances for the input.
1748 Created by make_tokens().
1749 tree: the ast tree for the input.
1750 Created by parse_ast().
1751 """
1752 # Init all ivars.
1753 self.file_name = file_name # For tests.
1754 self.node = None # The node being visited.
1755 self.tokens = tokens # The immutable list of input tokens.
1756 self.tree = tree # The tree of ast.AST nodes.
1757 # Traverse the tree.
1758 self.main_loop(tree)
1759 # Ensure that all tokens are patched.
1760 self.node = tree
1761 self.token(('endmarker', ''))
1762 # Return [] for compatibility with legacy code: list(tog.create_links).
1763 return []
1764 #@+node:ekr.20220402095550.4: *5* iterative.init_from_file
1765 def init_from_file(self, filename: str) -> Tuple[str, str, List["Token"], Node]: # pragma: no cover
1766 """
1767 Create the tokens and ast tree for the given file.
1768 Create links between tokens and the parse tree.
1769 Return (contents, encoding, tokens, tree).
1770 """
1771 self.filename = filename
1772 encoding, contents = read_file_with_encoding(filename)
1773 if not contents:
1774 return None, None, None, None
1775 self.tokens = tokens = make_tokens(contents)
1776 self.tree = tree = parse_ast(contents)
1777 self.create_links(tokens, tree)
1778 return contents, encoding, tokens, tree
1779 #@+node:ekr.20220402095550.5: *5* iterative.init_from_string
1780 def init_from_string(self, contents: str, filename: str) -> Tuple[List["Token"], Node]: # pragma: no cover
1781 """
1782 Tokenize, parse and create links in the contents string.
1784 Return (tokens, tree).
1785 """
1786 self.filename = filename
1787 self.tokens = tokens = make_tokens(contents)
1788 self.tree = tree = parse_ast(contents)
1789 self.create_links(tokens, tree)
1790 return tokens, tree
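# A minimal usage sketch (assuming the module-level helpers defined earlier
# in this file, such as make_tokens and parse_ast):
#
#     contents = "print('hello')\n"
#     itg = IterativeTokenGenerator()
#     tokens, tree = itg.init_from_string(contents, filename='<string>')
#     # Each significant token now has token.node set to its ast node,
#     # and each visited node carries parent/children links.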
1791 #@+node:ekr.20220402094825.1: *4* iterative: Synchronizers...
1792 # These synchronizer methods sync various kinds of tokens to nodes.
1793 #
1794 # These methods are (mostly) the same as in the TokenOrderGenerator class.
1795 #
1796 # Important: The sync_token in this class has a different signature from its TOG counterpart.
1797 # This slight difference makes it difficult to reuse the TOG methods,
1798 # say via monkey-patching.
1799 #
1800 # So I just copied/pasted these methods. This strategy suffices
1801 # to illustrate the ideas presented in this class.
1803 #@+node:ekr.20220402094825.2: *5* iterative.find_next_significant_token
1804 def find_next_significant_token(self) -> Optional["Token"]:
1805 """
1806 Scan from *after* self.tokens[px] looking for the next significant
1807 token.
1809 Return the token, or None. Never change self.px.
1810 """
1811 px = self.px + 1
1812 while px < len(self.tokens):
1813 token = self.tokens[px]
1814 px += 1
1815 if is_significant_token(token):
1816 return token
1817 # This will never happen, because the endmarker token is significant.
1818 return None # pragma: no cover
1819 #@+node:ekr.20220402094825.3: *5* iterative.set_links
1820 last_statement_node: Optional[Node] = None
1822 def set_links(self, node: Node, token: "Token") -> None:
1823 """Make two-way links between token and the given node."""
1824 # Don't bother assigning comment, comma, parens, ws and endmarker tokens.
1825 if token.kind == 'comment':
1826 # Append the comment to node.comment_list.
1827 comment_list: List["Token"] = getattr(node, 'comment_list', [])
1828 node.comment_list = comment_list + [token]
1829 return
1830 if token.kind in ('endmarker', 'ws'):
1831 return
1832 if token.kind == 'op' and token.value in ',()':
1833 return
1834 # *Always* remember the last statement.
1835 statement = find_statement_node(node)
1836 if statement:
1837 self.last_statement_node = statement
1838 assert not isinstance(self.last_statement_node, ast.Module)
1839 if token.node is not None: # pragma: no cover
1840 line_s = f"line {token.line_number}:"
1841 raise AssignLinksError(
1842 f" file: {self.filename}\n"
1843 f"{line_s:>12} {token.line.strip()}\n"
1844 f"token index: {self.px}\n"
1845 f"token.node is not None\n"
1846 f" token.node: {token.node.__class__.__name__}\n"
1847 f" callers: {g.callers()}")
1848 # Assign newlines to the previous statement node, if any.
1849 if token.kind in ('newline', 'nl'):
1850 # Set an *auxiliary* link for the split/join logic.
1851 # Do *not* set token.node!
1852 token.statement_node = self.last_statement_node
1853 return
1854 if is_significant_token(token):
1855 # Link the token to the ast node.
1856 token.node = node
1857 # Add the token to node's token_list.
1858 add_token_to_token_list(token, node)
1859 #@+node:ekr.20220402094825.4: *5* iterative.sync_name (aka name)
1860 def sync_name(self, val: str) -> None:
1861 aList = val.split('.')
1862 if len(aList) == 1:
1863 self.sync_token(('name', val))
1864 else:
1865 for i, part in enumerate(aList):
1866 self.sync_token(('name', part))
1867 if i < len(aList) - 1:
1868 self.sync_op('.')
1870 name = sync_name # for readability.
1871 #@+node:ekr.20220402094825.5: *5* iterative.sync_op (aka op)
1872 def sync_op(self, val: str) -> None:
1873 """
1874 Sync to the given operator.
1876 val may be '(' or ')' *only* if the parens *will* actually exist in the
1877 token list.
1878 """
1879 self.sync_token(('op', val))
1881 op = sync_op # For readability.
1882 #@+node:ekr.20220402094825.6: *5* iterative.sync_token (aka token)
1883 px = -1 # Index of the previously synced token.
1885 def sync_token(self, data: Tuple[Any, Any]) -> None:
1886 """
1887 Sync to a token whose kind & value are given. The token need not be
1888 significant, but it must be guaranteed to exist in the token list.
1890 The checks in this method constitute a strong, ever-present, unit test.
1892 Scan the tokens *after* px, looking for a token T matching (kind, val).
1893 raise AssignLinksError if a significant token is found that doesn't match T.
1894 Otherwise:
1895 - Create two-way links between all assignable tokens between px and T.
1896 - Create two-way links between T and self.node.
1897 - Advance by updating self.px to point to T.
1898 """
1899 kind, val = data
1900 node, tokens = self.node, self.tokens
1901 assert isinstance(node, ast.AST), repr(node)
1902 # g.trace(
1903 # f"px: {self.px:2} "
1904 # f"node: {node.__class__.__name__:<10} "
1905 # f"kind: {kind:>10}: val: {val!r}")
1906 #
1907 # Step one: Look for token T.
1908 old_px = px = self.px + 1
1909 while px < len(self.tokens):
1910 token = tokens[px]
1911 if (kind, val) == (token.kind, token.value):
1912 break # Success.
1913 if kind == token.kind == 'number':
1914 val = token.value
1915 break # Benign: use the token's value, a string, instead of a number.
1916 if is_significant_token(token): # pragma: no cover
1917 line_s = f"line {token.line_number}:"
1918 val = str(val) # for g.truncate.
1919 raise AssignLinksError(
1920 f" file: {self.filename}\n"
1921 f"{line_s:>12} {token.line.strip()}\n"
1922 f"Looking for: {kind}.{g.truncate(val, 40)!r}\n"
1923 f" found: {token.kind}.{token.value!r}\n"
1924 f"token.index: {token.index}\n")
1925 # Skip the insignificant token.
1926 px += 1
1927 else: # pragma: no cover
1928 val = str(val) # for g.truncate.
1929 raise AssignLinksError(
1930 f" file: {self.filename}\n"
1931 f"Looking for: {kind}.{g.truncate(val, 40)}\n"
1932 f" found: end of token list")
1933 #
1934 # Step two: Assign *secondary* links only for newline tokens.
1935 # Ignore all other non-significant tokens.
1936 while old_px < px:
1937 token = tokens[old_px]
1938 old_px += 1
1939 if token.kind in ('comment', 'newline', 'nl'):
1940 self.set_links(node, token)
1941 #
1942 # Step three: Set links in the found token.
1943 token = tokens[px]
1944 self.set_links(node, token)
1945 #
1946 # Step four: Advance.
1947 self.px = px
1949 token = sync_token # For readability.
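# For example, when the current node is an ast.FunctionDef, the queued action
# (self.name, 'def') ends up calling sync_token(('name', 'def')): it scans past
# any intervening whitespace, comment and newline tokens (giving newlines a
# statement_node link), links the 'def' token to self.node, and advances self.px.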
1950 #@+node:ekr.20220330164313.1: *4* iterative: Traversal...
1951 #@+node:ekr.20220402094946.2: *5* iterative.enter_node
1952 def enter_node(self, node: Node) -> None:
1953 """Enter a node."""
1954 # Update the stats.
1955 self.n_nodes += 1
1956 # Create parent/child links first, *before* updating self.node.
1957 #
1958 # Don't even *think* about removing the parent/child links.
1959 # The nearest_common_ancestor function depends upon them.
1960 node.parent = self.node
1961 if self.node:
1962 children: List[Node] = getattr(self.node, 'children', [])
1963 children.append(node)
1964 self.node.children = children
1965 # Inject the node_index field.
1966 assert not hasattr(node, 'node_index'), g.callers()
1967 node.node_index = self.node_index
1968 self.node_index += 1
1969 # begin_visitor and end_visitor must be paired.
1970 self.begin_end_stack.append(node.__class__.__name__)
1971 # Push the previous node.
1972 self.node_stack.append(self.node)
1973 # Update self.node *last*.
1974 self.node = node
1975 #@+node:ekr.20220402094946.3: *5* iterative.leave_node
1976 def leave_node(self, node: Node) -> None:
1977 """Leave a visitor."""
1978 # Make *sure* that begin_visitor and end_visitor are paired.
1979 entry_name = self.begin_end_stack.pop()
1980 assert entry_name == node.__class__.__name__, f"{entry_name!r} {node.__class__.__name__}"
1981 assert self.node == node, (repr(self.node), repr(node))
1982 # Restore self.node.
1983 self.node = self.node_stack.pop()
1984 #@+node:ekr.20220330120220.1: *5* iterative.main_loop
1985 def main_loop(self, node: Node) -> None:
1987 func = getattr(self, 'do_' + node.__class__.__name__, None)
1988 if not func: # pragma: no cover (defensive code)
1989 print('main_loop: invalid ast node:', repr(node))
1990 return
1991 exec_list: ActionList = [(func, node)]
1992 while exec_list:
1993 func, arg = exec_list.pop(0)
1994 result = func(arg)
1995 if result:
1996 # Prepend the result, a list of tuples.
1997 assert isinstance(result, list), repr(result)
1998 exec_list[:0] = result
2000 # For debugging...
2001 # try:
2002 # func, arg = data
2003 # if 0:
2004 # func_name = g.truncate(func.__name__, 15)
2005 # print(
2006 # f"{self.node.__class__.__name__:>10}:"
2007 # f"{func_name:>20} "
2008 # f"{arg.__class__.__name__}")
2009 # except ValueError:
2010 # g.trace('BAD DATA', self.node.__class__.__name__)
2011 # if isinstance(data, (list, tuple)):
2012 # for z in data:
2013 # print(data)
2014 # else:
2015 # print(repr(data))
2016 # raise
2017 #@+node:ekr.20220330155314.1: *5* iterative.visit
2018 def visit(self, node: Node) -> ActionList:
2019 """'Visit' an ast node by returning a new list of action tuples."""
2020 # Keep this trace.
2021 if False: # pragma: no cover
2022 cn = node.__class__.__name__ if node else ' '
2023 caller1, caller2 = g.callers(2).split(',')
2024 g.trace(f"{caller1:>15} {caller2:<14} {cn}")
2025 if node is None:
2026 return []
2027 # More general, more convenient.
2028 if isinstance(node, (list, tuple)):
2029 result = []
2030 for z in node:
2031 if isinstance(z, ast.AST):
2032 result.append((self.visit, z))
2033 else: # pragma: no cover (This might never happen).
2034 # All other fields should contain ints or strings.
2035 assert isinstance(z, (int, str)), z.__class__.__name__
2036 return result
2037 # We *do* want to crash if the visitor doesn't exist.
2038 assert isinstance(node, ast.AST), repr(node)
2039 method = getattr(self, 'do_' + node.__class__.__name__)
2040 # Don't call *anything* here. Just return a new list of tuples.
2041 return [
2042 (self.enter_node, node),
2043 (method, node),
2044 (self.leave_node, node),
2045 ]
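# So, for a single child such as an ast.Pass node, visit returns the sketch:
#
#     [(self.enter_node, node), (self.do_Pass, node), (self.leave_node, node)]
#
# main_loop prepends such lists to its work list, which yields a pre-order
# traversal without recursion.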
2046 #@+node:ekr.20220330133336.1: *4* iterative: Visitors
2047 #@+node:ekr.20220330133336.2: *5* iterative.keyword: not called!
2048 # keyword arguments supplied to call (NULL identifier for **kwargs)
2050 # keyword = (identifier? arg, expr value)
2052 def do_keyword(self, node: Node) -> List: # pragma: no cover
2053 """A keyword arg in an ast.Call."""
2054 # This should never be called.
2055 # iterative.handle_call_arguments calls self.visit(kwarg_arg.value) instead.
2056 filename = getattr(self, 'filename', '<no file>')
2057 raise AssignLinksError(
2058 f"file: {filename}\n"
2059 f"do_keyword should never be called\n"
2060 f"{g.callers(8)}")
2061 #@+node:ekr.20220330133336.3: *5* iterative: Contexts
2062 #@+node:ekr.20220330133336.4: *6* iterative.arg
2063 # arg = (identifier arg, expr? annotation)
2065 def do_arg(self, node: Node) -> ActionList:
2066 """This is one argument of a list of ast.Function or ast.Lambda arguments."""
2068 annotation = getattr(node, 'annotation', None)
2069 result: ActionList = [
2070 (self.name, node.arg),
2071 ]
2072 if annotation:
2073 result.extend([
2074 (self.op, ':'),
2075 (self.visit, annotation),
2076 ])
2077 return result
2079 #@+node:ekr.20220330133336.5: *6* iterative.arguments
2080 # arguments = (
2081 # arg* posonlyargs, arg* args, arg? vararg, arg* kwonlyargs,
2082 # expr* kw_defaults, arg? kwarg, expr* defaults
2083 # )
2085 def do_arguments(self, node: Node) -> ActionList:
2086 """Arguments to ast.Function or ast.Lambda, **not** ast.Call."""
2087 #
2088 # No need to generate commas anywhere below.
2089 #
2090 # Let block. Some fields may not exist pre Python 3.8.
2091 n_plain = len(node.args) - len(node.defaults)
2092 posonlyargs = getattr(node, 'posonlyargs', [])
2093 vararg = getattr(node, 'vararg', None)
2094 kwonlyargs = getattr(node, 'kwonlyargs', [])
2095 kw_defaults = getattr(node, 'kw_defaults', [])
2096 kwarg = getattr(node, 'kwarg', None)
2097 result: ActionList = []
2098 # 1. Sync the position-only args.
2099 if posonlyargs:
2100 for n, z in enumerate(posonlyargs):
2101 result.append((self.visit, z))
2102 result.append((self.op, '/'))
2103 # 2. Sync all args.
2104 for i, z in enumerate(node.args):
2105 result.append((self.visit, z))
2106 if i >= n_plain:
2107 result.extend([
2108 (self.op, '='),
2109 (self.visit, node.defaults[i - n_plain]),
2110 ])
2111 # 3. Sync the vararg.
2112 if vararg:
2113 result.extend([
2114 (self.op, '*'),
2115 (self.visit, vararg),
2116 ])
2117 # 4. Sync the keyword-only args.
2118 if kwonlyargs:
2119 if not vararg:
2120 result.append((self.op, '*'))
2121 for n, z in enumerate(kwonlyargs):
2122 result.append((self.visit, z))
2123 val = kw_defaults[n]
2124 if val is not None:
2125 result.extend([
2126 (self.op, '='),
2127 (self.visit, val),
2128 ])
2129 # 5. Sync the kwarg.
2130 if kwarg:
2131 result.extend([
2132 (self.op, '**'),
2133 (self.visit, kwarg),
2134 ])
2135 return result
2139 #@+node:ekr.20220330133336.6: *6* iterative.AsyncFunctionDef
2140 # AsyncFunctionDef(identifier name, arguments args, stmt* body, expr* decorator_list,
2141 # expr? returns)
2143 def do_AsyncFunctionDef(self, node: Node) -> ActionList:
2145 returns = getattr(node, 'returns', None)
2146 result: ActionList = []
2147 # Decorators...
2148 # @{z}\n
2149 for z in node.decorator_list or []:
2150 result.extend([
2151 (self.op, '@'),
2152 (self.visit, z)
2153 ])
2154 # Signature...
2155 # def name(args): -> returns\n
2156 # def name(args):\n
2157 result.extend([
2158 (self.name, 'async'),
2159 (self.name, 'def'),
2160 (self.name, node.name), # A string.
2161 (self.op, '('),
2162 (self.visit, node.args),
2163 (self.op, ')'),
2164 ])
2165 if returns is not None:
2166 result.extend([
2167 (self.op, '->'),
2168 (self.visit, node.returns),
2169 ])
2170 # Body...
2171 result.extend([
2172 (self.op, ':'),
2173 (self.visit, node.body),
2174 ])
2175 return result
2176 #@+node:ekr.20220330133336.7: *6* iterative.ClassDef
2177 def do_ClassDef(self, node: Node) -> ActionList:
2179 result: ActionList = []
2180 for z in node.decorator_list or []:
2181 # @{z}\n
2182 result.extend([
2183 (self.op, '@'),
2184 (self.visit, z),
2185 ])
2186 # class name(bases):\n
2187 result.extend([
2188 (self.name, 'class'),
2189 (self.name, node.name), # A string.
2190 ])
2191 if node.bases:
2192 result.extend([
2193 (self.op, '('),
2194 (self.visit, node.bases),
2195 (self.op, ')'),
2196 ])
2197 result.extend([
2198 (self.op, ':'),
2199 (self.visit, node.body),
2200 ])
2201 return result
2202 #@+node:ekr.20220330133336.8: *6* iterative.FunctionDef
2203 # FunctionDef(
2204 # identifier name, arguments args,
2205 # stmt* body,
2206 # expr* decorator_list,
2207 # expr? returns,
2208 # string? type_comment)
2210 def do_FunctionDef(self, node: Node) -> ActionList:
2212 returns = getattr(node, 'returns', None)
2213 result: ActionList = []
2214 # Decorators...
2215 # @{z}\n
2216 for z in node.decorator_list or []:
2217 result.extend([
2218 (self.op, '@'),
2219 (self.visit, z)
2220 ])
2221 # Signature...
2222 # def name(args): -> returns\n
2223 # def name(args):\n
2224 result.extend([
2225 (self.name, 'def'),
2226 (self.name, node.name), # A string.
2227 (self.op, '('),
2228 (self.visit, node.args),
2229 (self.op, ')'),
2230 ])
2231 if returns is not None:
2232 result.extend([
2233 (self.op, '->'),
2234 (self.visit, node.returns),
2235 ])
2236 # Body...
2237 result.extend([
2238 (self.op, ':'),
2239 (self.visit, node.body),
2240 ])
2241 return result
2242 #@+node:ekr.20220330133336.9: *6* iterative.Interactive
2243 def do_Interactive(self, node: Node) -> ActionList: # pragma: no cover
2245 return [
2246 (self.visit, node.body),
2247 ]
2248 #@+node:ekr.20220330133336.10: *6* iterative.Lambda
2249 def do_Lambda(self, node: Node) -> ActionList:
2251 return [
2252 (self.name, 'lambda'),
2253 (self.visit, node.args),
2254 (self.op, ':'),
2255 (self.visit, node.body),
2256 ]
2258 #@+node:ekr.20220330133336.11: *6* iterative.Module
2259 def do_Module(self, node: Node) -> ActionList:
2261 # Encoding is a non-syncing statement.
2262 return [
2263 (self.visit, node.body),
2264 ]
2265 #@+node:ekr.20220330133336.12: *5* iterative: Expressions
2266 #@+node:ekr.20220330133336.13: *6* iterative.Expr
2267 def do_Expr(self, node: Node) -> ActionList:
2268 """An outer expression."""
2269 # No need to put parentheses.
2270 return [
2271 (self.visit, node.value),
2272 ]
2273 #@+node:ekr.20220330133336.14: *6* iterative.Expression
2274 def do_Expression(self, node: Node) -> ActionList: # pragma: no cover
2275 """An inner expression."""
2276 # No need to put parentheses.
2277 return [
2278 (self.visit, node.body),
2279 ]
2280 #@+node:ekr.20220330133336.15: *6* iterative.GeneratorExp
2281 def do_GeneratorExp(self, node: Node) -> ActionList:
2282 # '<gen %s for %s>' % (elt, ','.join(gens))
2283 # No need to put parentheses or commas.
2284 return [
2285 (self.visit, node.elt),
2286 (self.visit, node.generators),
2287 ]
2288 #@+node:ekr.20220330133336.16: *6* iterative.NamedExpr
2289 # NamedExpr(expr target, expr value)
2291 def do_NamedExpr(self, node: Node) -> ActionList: # Python 3.8+
2293 return [
2294 (self.visit, node.target),
2295 (self.op, ':='),
2296 (self.visit, node.value),
2297 ]
2298 #@+node:ekr.20220402160128.1: *5* iterative: Operands
2299 #@+node:ekr.20220402160128.2: *6* iterative.Attribute
2300 # Attribute(expr value, identifier attr, expr_context ctx)
2302 def do_Attribute(self, node: Node) -> ActionList:
2304 return [
2305 (self.visit, node.value),
2306 (self.op, '.'),
2307 (self.name, node.attr), # A string.
2308 ]
2309 #@+node:ekr.20220402160128.3: *6* iterative.Bytes
2310 def do_Bytes(self, node: Node) -> ActionList:
2312 """
2313 It's invalid to mix bytes and non-bytes literals, so just
2314 advancing to the next 'string' token suffices.
2315 """
2316 token = self.find_next_significant_token()
2317 return [
2318 (self.token, ('string', token.value)),
2319 ]
2320 #@+node:ekr.20220402160128.4: *6* iterative.comprehension
2321 # comprehension = (expr target, expr iter, expr* ifs, int is_async)
2323 def do_comprehension(self, node: Node) -> ActionList:
2325 # No need to put parentheses.
2326 result: ActionList = [
2327 (self.name, 'for'),
2328 (self.visit, node.target), # A name
2329 (self.name, 'in'),
2330 (self.visit, node.iter),
2331 ]
2332 for z in node.ifs or []:
2333 result.extend([
2334 (self.name, 'if'),
2335 (self.visit, z),
2336 ])
2337 return result
2338 #@+node:ekr.20220402160128.5: *6* iterative.Constant
2339 def do_Constant(self, node: Node) -> ActionList: # pragma: no cover
2340 """
2342 https://greentreesnakes.readthedocs.io/en/latest/nodes.html
2344 A constant. The value attribute holds the Python object it represents.
2345 This can be simple types such as a number, string or None, but also
2346 immutable container types (tuples and frozensets) if all of their
2347 elements are constant.
2348 """
2349 # Support Python 3.8.
2350 if node.value is None or isinstance(node.value, bool):
2351 # Weird: return a name!
2352 return [
2353 (self.token, ('name', repr(node.value))),
2354 ]
2355 if node.value == Ellipsis:
2356 return [
2357 (self.op, '...'),
2358 ]
2359 if isinstance(node.value, str):
2360 return self.do_Str(node)
2361 if isinstance(node.value, (int, float)):
2362 return [
2363 (self.token, ('number', repr(node.value))),
2364 ]
2365 if isinstance(node.value, bytes):
2366 return self.do_Bytes(node)
2367 if isinstance(node.value, tuple):
2368 return self.do_Tuple(node)
2369 if isinstance(node.value, frozenset):
2370 return self.do_Set(node)
2371 g.trace('----- Oops -----', repr(node.value), g.callers())
2372 return []
2374 #@+node:ekr.20220402160128.6: *6* iterative.Dict
2375 # Dict(expr* keys, expr* values)
2377 def do_Dict(self, node: Node) -> ActionList:
2379 assert len(node.keys) == len(node.values)
2380 result: ActionList = [
2381 (self.op, '{'),
2382 ]
2383 # No need to put commas.
2384 for i, key in enumerate(node.keys):
2385 key, value = node.keys[i], node.values[i]
2386 result.extend([
2387 (self.visit, key), # a Str node.
2388 (self.op, ':'),
2389 ])
2390 if value is not None:
2391 result.append((self.visit, value))
2392 result.append((self.op, '}'))
2393 return result
2394 #@+node:ekr.20220402160128.7: *6* iterative.DictComp
2395 # DictComp(expr key, expr value, comprehension* generators)
2397 # d2 = {val: key for key, val in d}
2399 def do_DictComp(self, node: Node) -> ActionList:
2401 result: ActionList = [
2402 (self.token, ('op', '{')),
2403 (self.visit, node.key),
2404 (self.op, ':'),
2405 (self.visit, node.value),
2406 ]
2407 for z in node.generators or []:
2408 result.extend([
2409 (self.visit, z),
2410 (self.token, ('op', '}')),
2411 ])
2412 return result
2414 #@+node:ekr.20220402160128.8: *6* iterative.Ellipsis
2415 def do_Ellipsis(self, node: Node) -> ActionList: # pragma: no cover (Does not exist for python 3.8+)
2417 return [
2418 (self.op, '...'),
2419 ]
2420 #@+node:ekr.20220402160128.9: *6* iterative.ExtSlice
2421 # https://docs.python.org/3/reference/expressions.html#slicings
2423 # ExtSlice(slice* dims)
2425 def do_ExtSlice(self, node: Node) -> ActionList: # pragma: no cover (deprecated)
2427 result: ActionList = []
2428 for i, z in enumerate(node.dims):
2429 result.append((self.visit, z))
2430 if i < len(node.dims) - 1:
2431 result.append((self.op, ','))
2432 return result
2433 #@+node:ekr.20220402160128.10: *6* iterative.Index
2434 def do_Index(self, node: Node) -> ActionList: # pragma: no cover (deprecated)
2436 return [
2437 (self.visit, node.value),
2438 ]
2439 #@+node:ekr.20220402160128.11: *6* iterative.FormattedValue: not called!
2440 # FormattedValue(expr value, int? conversion, expr? format_spec)
2442 def do_FormattedValue(self, node: Node) -> ActionList: # pragma: no cover
2443 """
2444 This node represents the *components* of a *single* f-string.
2446 Happily, JoinedStr nodes *also* represent *all* f-strings,
2447 so the TOG should *never* visit this node!
2448 """
2449 filename = getattr(self, 'filename', '<no file>')
2450 raise AssignLinksError(
2451 f"file: {filename}\n"
2452 f"do_FormattedValue should never be called")
2454 # This code has no chance of being useful...
2455 # conv = node.conversion
2456 # spec = node.format_spec
2457 # self.visit(node.value)
2458 # if conv is not None:
2459 # self.token('number', conv)
2460 # if spec is not None:
2461 # self.visit(node.format_spec)
2462 #@+node:ekr.20220402160128.12: *6* iterative.JoinedStr & helpers
2463 # JoinedStr(expr* values)
2465 def do_JoinedStr(self, node: Node) -> ActionList:
2466 """
2467 JoinedStr nodes represent at least one f-string and all other strings
2468 concatenated to it.
2470 Analyzing JoinedStr.values would be extremely tricky, for reasons that
2471 need not be explained here.
2473 Instead, we get the tokens *from the token list itself*!
2474 """
2475 return [
2476 (self.token, (z.kind, z.value))
2477 for z in self.get_concatenated_string_tokens()
2478 ]
2479 #@+node:ekr.20220402160128.13: *6* iterative.List
2480 def do_List(self, node: Node) -> ActionList:
2482 # No need to put commas.
2483 return [
2484 (self.op, '['),
2485 (self.visit, node.elts),
2486 (self.op, ']'),
2487 ]
2488 #@+node:ekr.20220402160128.14: *6* iterative.ListComp
2489 # ListComp(expr elt, comprehension* generators)
2491 def do_ListComp(self, node: Node) -> ActionList:
2493 result: ActionList = [
2494 (self.op, '['),
2495 (self.visit, node.elt),
2496 ]
2497 for z in node.generators:
2498 result.append((self.visit, z))
2499 result.append((self.op, ']'))
2500 return result
2501 #@+node:ekr.20220402160128.15: *6* iterative.Name & NameConstant
2502 def do_Name(self, node: Node) -> ActionList:
2504 return [
2505 (self.name, node.id),
2506 ]
2508 def do_NameConstant(self, node: Node) -> ActionList: # pragma: no cover (Does not exist in Python 3.8+)
2510 return [
2511 (self.name, repr(node.value)),
2512 ]
2513 #@+node:ekr.20220402160128.16: *6* iterative.Num
2514 def do_Num(self, node: Node) -> ActionList: # pragma: no cover (Does not exist in Python 3.8+)
2516 return [
2517 (self.token, ('number', node.n)),
2518 ]
2519 #@+node:ekr.20220402160128.17: *6* iterative.Set
2520 # Set(expr* elts)
2522 def do_Set(self, node: Node) -> ActionList:
2524 return [
2525 (self.op, '{'),
2526 (self.visit, node.elts),
2527 (self.op, '}'),
2528 ]
2529 #@+node:ekr.20220402160128.18: *6* iterative.SetComp
2530 # SetComp(expr elt, comprehension* generators)
2532 def do_SetComp(self, node: Node) -> ActionList:
2534 result: ActionList = [
2535 (self.op, '{'),
2536 (self.visit, node.elt),
2537 ]
2538 for z in node.generators or []:
2539 result.append((self.visit, z))
2540 result.append((self.op, '}'))
2541 return result
2542 #@+node:ekr.20220402160128.19: *6* iterative.Slice
2543 # slice = Slice(expr? lower, expr? upper, expr? step)
2545 def do_Slice(self, node: Node) -> ActionList:
2547 lower = getattr(node, 'lower', None)
2548 upper = getattr(node, 'upper', None)
2549 step = getattr(node, 'step', None)
2550 result: ActionList = []
2551 if lower is not None:
2552 result.append((self.visit, lower))
2553 # Always put the colon between upper and lower.
2554 result.append((self.op, ':'))
2555 if upper is not None:
2556 result.append((self.visit, upper))
2557 # Put the second colon if it exists in the token list.
2558 if step is None:
2559 result.append((self.slice_helper, node))
2560 else:
2561 result.extend([
2562 (self.op, ':'),
2563 (self.visit, step),
2564 ])
2565 return result
2567 def slice_helper(self, node: Node) -> ActionList:
2568 """Delayed evaluation!"""
2569 token = self.find_next_significant_token()
2570 if token and token.value == ':':
2571 return [
2572 (self.op, ':'),
2573 ]
2574 return []
2575 #@+node:ekr.20220402160128.20: *6* iterative.Str & helper
2576 def do_Str(self, node: Node) -> ActionList:
2577 """This node represents a string constant."""
2578 # This loop is necessary to handle string concatenation.
2579 return [
2580 (self.token, (z.kind, z.value))
2581 for z in self.get_concatenated_string_tokens()
2582 ]
2584 #@+node:ekr.20220402160128.21: *7* iterative.get_concatenated_tokens
2585 def get_concatenated_string_tokens(self) -> List:
2586 """
2587 Return the next 'string' token and all 'string' tokens concatenated to
2588 it. *Never* update self.px here.
2589 """
2590 trace = False
2591 tag = 'iterative.get_concatenated_string_tokens'
2592 i = self.px
2593 # First, find the next significant token. It should be a string.
2594 i, token = i + 1, None
2595 while i < len(self.tokens):
2596 token = self.tokens[i]
2597 i += 1
2598 if token.kind == 'string':
2599 # Rescan the string.
2600 i -= 1
2601 break
2602 # An error.
2603 if is_significant_token(token): # pragma: no cover
2604 break
2605 # Raise an error if we didn't find the expected 'string' token.
2606 if not token or token.kind != 'string': # pragma: no cover
2607 if not token:
2608 token = self.tokens[-1]
2609 filename = getattr(self, 'filename', '<no filename>')
2610 raise AssignLinksError(
2611 f"\n"
2612 f"{tag}...\n"
2613 f"file: {filename}\n"
2614 f"line: {token.line_number}\n"
2615 f" i: {i}\n"
2616 f"expected 'string' token, got {token!s}")
2617 # Accumulate string tokens.
2618 assert self.tokens[i].kind == 'string'
2619 results = []
2620 while i < len(self.tokens):
2621 token = self.tokens[i]
2622 i += 1
2623 if token.kind == 'string':
2624 results.append(token)
2625 elif token.kind == 'op' or is_significant_token(token):
2626 # Any significant token *or* any op will halt string concatenation.
2627 break
2628 # 'ws', 'nl', 'newline', 'comment', 'indent', 'dedent', etc.
2629 # The (significant) 'endmarker' token ensures that results will be non-empty.
2630 assert results
2631 if trace: # pragma: no cover
2632 g.printObj(results, tag=f"{tag}: Results")
2633 return results
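# For example, for the source line  s = 'abc' 'xyz'  the token list holds two
# adjacent 'string' tokens; this helper returns both, so do_Str and do_JoinedStr
# sync the whole implicit concatenation against a single ast node.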
2634 #@+node:ekr.20220402160128.22: *6* iterative.Subscript
2635 # Subscript(expr value, slice slice, expr_context ctx)
2637 def do_Subscript(self, node: Node) -> ActionList:
2639 return [
2640 (self.visit, node.value),
2641 (self.op, '['),
2642 (self.visit, node.slice),
2643 (self.op, ']'),
2644 ]
2645 #@+node:ekr.20220402160128.23: *6* iterative.Tuple
2646 # Tuple(expr* elts, expr_context ctx)
2648 def do_Tuple(self, node: Node) -> ActionList:
2650 # Do not call gen_op for parens or commas here.
2651 # They do not necessarily exist in the token list!
2653 return [
2654 (self.visit, node.elts),
2655 ]
2656 #@+node:ekr.20220330133336.40: *5* iterative: Operators
2657 #@+node:ekr.20220330133336.41: *6* iterative.BinOp
2658 def do_BinOp(self, node: Node) -> ActionList:
2660 return [
2661 (self.visit, node.left),
2662 (self.op, op_name(node.op)),
2663 (self.visit, node.right),
2664 ]
2666 #@+node:ekr.20220330133336.42: *6* iterative.BoolOp
2667 # BoolOp(boolop op, expr* values)
2669 def do_BoolOp(self, node: Node) -> ActionList:
2671 result: ActionList = []
2672 op_name_ = op_name(node.op)
2673 for i, z in enumerate(node.values):
2674 result.append((self.visit, z))
2675 if i < len(node.values) - 1:
2676 result.append((self.name, op_name_))
2677 return result
2678 #@+node:ekr.20220330133336.43: *6* iterative.Compare
2679 # Compare(expr left, cmpop* ops, expr* comparators)
2681 def do_Compare(self, node: Node) -> ActionList:
2683 assert len(node.ops) == len(node.comparators)
2684 result: ActionList = [(self.visit, node.left)]
2685 for i, z in enumerate(node.ops):
2686 op_name_ = op_name(node.ops[i])
2687 if op_name_ in ('not in', 'is not'):
2688 for z in op_name_.split(' '):
2689 result.append((self.name, z))
2690 elif op_name_.isalpha():
2691 result.append((self.name, op_name_))
2692 else:
2693 result.append((self.op, op_name_))
2694 result.append((self.visit, node.comparators[i]))
2695 return result
2696 #@+node:ekr.20220330133336.44: *6* iterative.UnaryOp
2697 def do_UnaryOp(self, node: Node) -> ActionList:
2699 op_name_ = op_name(node.op)
2700 result: ActionList = []
2701 if op_name_.isalpha():
2702 result.append((self.name, op_name_))
2703 else:
2704 result.append((self.op, op_name_))
2705 result.append((self.visit, node.operand))
2706 return result
2707 #@+node:ekr.20220330133336.45: *6* iterative.IfExp (ternary operator)
2708 # IfExp(expr test, expr body, expr orelse)
2710 def do_IfExp(self, node: Node) -> ActionList:
2712 #'%s if %s else %s'
2713 return [
2714 (self.visit, node.body),
2715 (self.name, 'if'),
2716 (self.visit, node.test),
2717 (self.name, 'else'),
2718 (self.visit, node.orelse),
2719 ]
2721 #@+node:ekr.20220330133336.46: *5* iterative: Statements
2722 #@+node:ekr.20220330133336.47: *6* iterative.Starred
2723 # Starred(expr value, expr_context ctx)
2725 def do_Starred(self, node: Node) -> ActionList:
2726 """A starred argument to an ast.Call"""
2727 return [
2728 (self.op, '*'),
2729 (self.visit, node.value),
2730 ]
2731 #@+node:ekr.20220330133336.48: *6* iterative.AnnAssign
2732 # AnnAssign(expr target, expr annotation, expr? value, int simple)
2734 def do_AnnAssign(self, node: Node) -> ActionList:
2736 # {node.target}:{node.annotation}={node.value}\n'
2737 result: ActionList = [
2738 (self.visit, node.target),
2739 (self.op, ':'),
2740 (self.visit, node.annotation),
2741 ]
2742 if node.value is not None: # #1851
2743 result.extend([
2744 (self.op, '='),
2745 (self.visit, node.value),
2746 ])
2747 return result
2748 #@+node:ekr.20220330133336.49: *6* iterative.Assert
2749 # Assert(expr test, expr? msg)
2751 def do_Assert(self, node: Node) -> ActionList:
2753 # No need to put parentheses or commas.
2754 msg = getattr(node, 'msg', None)
2755 result: ActionList = [
2756 (self.name, 'assert'),
2757 (self.visit, node.test),
2758 ]
2759 if msg is not None:
2760 result.append((self.visit, node.msg))
2761 return result
2762 #@+node:ekr.20220330133336.50: *6* iterative.Assign
2763 def do_Assign(self, node: Node) -> ActionList:
2765 result: ActionList = []
2766 for z in node.targets:
2767 result.extend([
2768 (self.visit, z),
2769 (self.op, '=')
2770 ])
2771 result.append((self.visit, node.value))
2772 return result
2773 #@+node:ekr.20220330133336.51: *6* iterative.AsyncFor
2774 def do_AsyncFor(self, node: Node) -> ActionList:
2776 # The 'async for' line...
2777 # Py 3.8 changes the kind of token.
2778 async_token_type = 'async' if has_async_tokens else 'name'
2779 result: ActionList = [
2780 (self.token, (async_token_type, 'async')),
2781 (self.name, 'for'),
2782 (self.visit, node.target),
2783 (self.name, 'in'),
2784 (self.visit, node.iter),
2785 (self.op, ':'),
2786 # Body...
2787 (self.visit, node.body),
2788 ]
2789 # Else clause...
2790 if node.orelse:
2791 result.extend([
2792 (self.name, 'else'),
2793 (self.op, ':'),
2794 (self.visit, node.orelse),
2795 ])
2796 return result
2797 #@+node:ekr.20220330133336.52: *6* iterative.AsyncWith
2798 def do_AsyncWith(self, node: Node) -> ActionList:
2800 async_token_type = 'async' if has_async_tokens else 'name'
2801 return [
2802 (self.token, (async_token_type, 'async')),
2803 (self.do_With, node),
2804 ]
2805 #@+node:ekr.20220330133336.53: *6* iterative.AugAssign
2806 # AugAssign(expr target, operator op, expr value)
2808 def do_AugAssign(self, node: Node) -> ActionList:
2810 # %s%s=%s\n'
2811 return [
2812 (self.visit, node.target),
2813 (self.op, op_name(node.op) + '='),
2814 (self.visit, node.value),
2815 ]
2816 #@+node:ekr.20220330133336.54: *6* iterative.Await
2817 # Await(expr value)
2819 def do_Await(self, node: Node) -> ActionList:
2821 #'await %s\n'
2822 async_token_type = 'await' if has_async_tokens else 'name'
2823 return [
2824 (self.token, (async_token_type, 'await')),
2825 (self.visit, node.value),
2826 ]
2827 #@+node:ekr.20220330133336.55: *6* iterative.Break
2828 def do_Break(self, node: Node) -> ActionList:
2830 return [
2831 (self.name, 'break'),
2832 ]
2833 #@+node:ekr.20220330133336.56: *6* iterative.Call & helpers
2834 # Call(expr func, expr* args, keyword* keywords)
2836 # Python 3 ast.Call nodes do not have 'starargs' or 'kwargs' fields.
2838 def do_Call(self, node: Node) -> ActionList:
2840 # The calls to op(')') and op('(') do nothing by default.
2841 # No need to generate any commas.
2842 # Subclasses might handle them in an overridden iterative.set_links.
2843 return [
2844 (self.visit, node.func),
2845 (self.op, '('),
2846 (self.handle_call_arguments, node),
2847 (self.op, ')'),
2848 ]
2849 #@+node:ekr.20220330133336.57: *7* iterative.arg_helper
2850 def arg_helper(self, node: Node) -> ActionList:
2851 """
2852 Return an action for the node, with a special case for strings.
2853 """
2854 result: ActionList = []
2855 if isinstance(node, str):
2856 result.append((self.token, ('name', node)))
2857 else:
2858 result.append((self.visit, node))
2859 return result
2860 #@+node:ekr.20220330133336.58: *7* iterative.handle_call_arguments
2861 def handle_call_arguments(self, node: Node) -> ActionList:
2862 """
2863 Generate arguments in the correct order.
2865 Call(expr func, expr* args, keyword* keywords)
2867 https://docs.python.org/3/reference/expressions.html#calls
2869 Warning: This code will fail on Python 3.8 only for calls
2870 containing kwargs in unexpected places.
2871 """
2872 # *args: in node.args[]: Starred(value=Name(id='args'))
2873 # *[a, 3]: in node.args[]: Starred(value=List(elts=[Name(id='a'), Num(n=3)])
2874 # **kwargs: in node.keywords[]: keyword(arg=None, value=Name(id='kwargs'))
2875 #
2876 # Scan args for *name or *List
2877 args = node.args or []
2878 keywords = node.keywords or []
2880 def get_pos(obj: Any) -> Tuple:
2881 line1 = getattr(obj, 'lineno', None)
2882 col1 = getattr(obj, 'col_offset', None)
2883 return line1, col1, obj
2885 def sort_key(aTuple: Tuple) -> int:
2886 line, col, obj = aTuple
2887 return line * 1000 + col
2889 assert py_version >= (3, 9)
2891 places = [get_pos(z) for z in args + keywords]
2892 places.sort(key=sort_key)
2893 ordered_args = [z[2] for z in places]
2894 result: ActionList = []
2895 for z in ordered_args:
2896 if isinstance(z, ast.Starred):
2897 result.extend([
2898 (self.op, '*'),
2899 (self.visit, z.value),
2900 ])
2901 elif isinstance(z, ast.keyword):
2902 if getattr(z, 'arg', None) is None:
2903 result.extend([
2904 (self.op, '**'),
2905 (self.arg_helper, z.value),
2906 ])
2907 else:
2908 result.extend([
2909 (self.arg_helper, z.arg),
2910 (self.op, '='),
2911 (self.arg_helper, z.value),
2912 ])
2913 else:
2914 result.append((self.arg_helper, z))
2915 return result
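# For example, for the call  f(a, *rest, key=1)  the args and keywords are
# merged and sorted by (lineno, col_offset), so the queued actions follow
# source order: visit a, then '*' and rest, then 'key', '=' and the value 1.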
2916 #@+node:ekr.20220330133336.59: *6* iterative.Continue
2917 def do_Continue(self, node: Node) -> ActionList:
2919 return [
2920 (self.name, 'continue'),
2921 ]
2922 #@+node:ekr.20220330133336.60: *6* iterative.Delete
2923 def do_Delete(self, node: Node) -> ActionList:
2925 # No need to put commas.
2926 return [
2927 (self.name, 'del'),
2928 (self.visit, node.targets),
2929 ]
2930 #@+node:ekr.20220330133336.61: *6* iterative.ExceptHandler
2931 def do_ExceptHandler(self, node: Node) -> ActionList:
2933 # Except line...
2934 result: ActionList = [
2935 (self.name, 'except'),
2936 ]
2937 if getattr(node, 'type', None):
2938 result.append((self.visit, node.type))
2939 if getattr(node, 'name', None):
2940 result.extend([
2941 (self.name, 'as'),
2942 (self.name, node.name),
2943 ])
2944 result.extend([
2945 (self.op, ':'),
2946 # Body...
2947 (self.visit, node.body),
2948 ])
2949 return result
2950 #@+node:ekr.20220330133336.62: *6* iterative.For
2951 def do_For(self, node: Node) -> ActionList:
2953 result: ActionList = [
2954 # The 'for' line...
2955 (self.name, 'for'),
2956 (self.visit, node.target),
2957 (self.name, 'in'),
2958 (self.visit, node.iter),
2959 (self.op, ':'),
2960 # Body...
2961 (self.visit, node.body),
2962 ]
2963 # Else clause...
2964 if node.orelse:
2965 result.extend([
2966 (self.name, 'else'),
2967 (self.op, ':'),
2968 (self.visit, node.orelse),
2969 ])
2970 return result
2971 #@+node:ekr.20220330133336.63: *6* iterative.Global
2972 # Global(identifier* names)
2974 def do_Global(self, node: Node) -> ActionList:
2976 result = [
2977 (self.name, 'global'),
2978 ]
2979 for z in node.names:
2980 result.append((self.name, z))
2981 return result
2982 #@+node:ekr.20220330133336.64: *6* iterative.If & helpers
2983 # If(expr test, stmt* body, stmt* orelse)
2985 def do_If(self, node: Node) -> ActionList:
2986 #@+<< do_If docstring >>
2987 #@+node:ekr.20220330133336.65: *7* << do_If docstring >>
2988 """
2989 The parse trees for the following are identical!
2991 if 1:           if 1:
2992     pass            pass
2993 else:           elif 2:
2994     if 2:           pass
2995         pass
2997 So there is *no* way for the 'if' visitor to disambiguate the above two
2998 cases from the parse tree alone.
3000 Instead, we scan the tokens list for the next 'if', 'else' or 'elif' token.
3001 """
3002 #@-<< do_If docstring >>
3003 # Use the next significant token to distinguish between 'if' and 'elif'.
3004 token = self.find_next_significant_token()
3005 result: ActionList = [
3006 (self.name, token.value),
3007 (self.visit, node.test),
3008 (self.op, ':'),
3009 # Body...
3010 (self.visit, node.body),
3011 ]
3012 # Else and elif clauses...
3013 if node.orelse:
3014 # We *must* delay the evaluation of the else clause.
3015 result.append((self.if_else_helper, node))
3016 return result
3018 def if_else_helper(self, node: Node) -> ActionList:
3019 """Delayed evaluation!"""
3020 token = self.find_next_significant_token()
3021 if token.value == 'else':
3022 return [
3023 (self.name, 'else'),
3024 (self.op, ':'),
3025 (self.visit, node.orelse),
3026 ]
3027 return [
3028 (self.visit, node.orelse),
3029 ]
3030 #@+node:ekr.20220330133336.66: *6* iterative.Import & helper
3031 def do_Import(self, node: Node) -> ActionList:
3033 result: ActionList = [
3034 (self.name, 'import'),
3035 ]
3036 for alias in node.names:
3037 result.append((self.name, alias.name))
3038 if alias.asname:
3039 result.extend([
3040 (self.name, 'as'),
3041 (self.name, alias.asname),
3042 ])
3043 return result
3044 #@+node:ekr.20220330133336.67: *6* iterative.ImportFrom
3045 # ImportFrom(identifier? module, alias* names, int? level)
3047 def do_ImportFrom(self, node: Node) -> ActionList:
3049 result: ActionList = [
3050 (self.name, 'from'),
3051 ]
3052 for i in range(node.level):
3053 result.append((self.op, '.'))
3054 if node.module:
3055 result.append((self.name, node.module))
3056 result.append((self.name, 'import'))
3057 # No need to put commas.
3058 for alias in node.names:
3059 if alias.name == '*': # #1851.
3060 result.append((self.op, '*'))
3061 else:
3062 result.append((self.name, alias.name))
3063 if alias.asname:
3064 result.extend([
3065 (self.name, 'as'),
3066 (self.name, alias.asname),
3067 ])
3068 return result
3069 #@+node:ekr.20220402124844.1: *6* iterative.Match* (Python 3.10+)
3070 # Match(expr subject, match_case* cases)
3072 # match_case = (pattern pattern, expr? guard, stmt* body)
3074 # Full syntax diagram: # https://peps.python.org/pep-0634/#appendix-a
3076 def do_Match(self, node: Node) -> ActionList:
3078 cases = getattr(node, 'cases', [])
3079 result: ActionList = [
3080 (self.name, 'match'),
3081 (self.visit, node.subject),
3082 (self.op, ':'),
3083 ]
3084 for case in cases:
3085 result.append((self.visit, case))
3086 return result
3087 #@+node:ekr.20220402124844.2: *7* iterative.match_case
3088 # match_case = (pattern pattern, expr? guard, stmt* body)
3090 def do_match_case(self, node: Node) -> ActionList:
3092 guard = getattr(node, 'guard', None)
3093 body = getattr(node, 'body', [])
3094 result: ActionList = [
3095 (self.name, 'case'),
3096 (self.visit, node.pattern),
3097 ]
3098 if guard:
3099 result.extend([
3100 (self.name, 'if'),
3101 (self.visit, guard),
3102 ])
3103 result.append((self.op, ':'))
3104 for statement in body:
3105 result.append((self.visit, statement))
3106 return result
3107 #@+node:ekr.20220402124844.3: *7* iterative.MatchAs
3108 # MatchAs(pattern? pattern, identifier? name)
3110 def do_MatchAs(self, node: Node) -> ActionList:
3111 pattern = getattr(node, 'pattern', None)
3112 name = getattr(node, 'name', None)
3113 result: ActionList = []
3114 if pattern and name:
3115 result.extend([
3116 (self.visit, pattern),
3117 (self.name, 'as'),
3118 (self.name, name),
3119 ])
3120 elif pattern:
3121 result.append((self.visit, pattern)) # pragma: no cover
3122 else:
3123 result.append((self.name, name or '_'))
3124 return result
3125 #@+node:ekr.20220402124844.4: *7* iterative.MatchClass
3126 # MatchClass(expr cls, pattern* patterns, identifier* kwd_attrs, pattern* kwd_patterns)
3128 def do_MatchClass(self, node: Node) -> ActionList:
3130 patterns = getattr(node, 'patterns', [])
3131 kwd_attrs = getattr(node, 'kwd_attrs', [])
3132 kwd_patterns = getattr(node, 'kwd_patterns', [])
3133 result: ActionList = [
3134 (self.visit, node.cls),
3135 (self.op, '('),
3136 ]
3137 for pattern in patterns:
3138 result.append((self.visit, pattern))
3139 for i, kwd_attr in enumerate(kwd_attrs):
3140 result.extend([
3141 (self.name, kwd_attr), # a String.
3142 (self.op, '='),
3143 (self.visit, kwd_patterns[i]),
3144 ])
3145 result.append((self.op, ')'))
3146 return result
3147 #@+node:ekr.20220402124844.5: *7* iterative.MatchMapping
3148 # MatchMapping(expr* keys, pattern* patterns, identifier? rest)
3150 def do_MatchMapping(self, node: Node) -> ActionList:
3151 keys = getattr(node, 'keys', [])
3152 patterns = getattr(node, 'patterns', [])
3153 rest = getattr(node, 'rest', None)
3154 result: ActionList = [
3155 (self.op, '{'),
3156 ]
3157 for i, key in enumerate(keys):
3158 result.extend([
3159 (self.visit, key),
3160 (self.op, ':'),
3161 (self.visit, patterns[i]),
3162 ])
3163 if rest:
3164 result.extend([
3165 (self.op, '**'),
3166 (self.name, rest), # A string.
3167 ])
3168 result.append((self.op, '}'))
3169 return result
3170 #@+node:ekr.20220402124844.6: *7* iterative.MatchOr
3171 # MatchOr(pattern* patterns)
3173 def do_MatchOr(self, node: Node) -> ActionList:
3175 patterns = getattr(node, 'patterns', [])
3176 result: ActionList = []
3177 for i, pattern in enumerate(patterns):
3178 if i > 0:
3179 result.append((self.op, '|'))
3180 result.append((self.visit, pattern))
3181 return result
3182 #@+node:ekr.20220402124844.7: *7* iterative.MatchSequence
3183 # MatchSequence(pattern* patterns)
3185 def do_MatchSequence(self, node: Node) -> ActionList:
3186 patterns = getattr(node, 'patterns', [])
3187 result: ActionList = []
3188 # Scan for the next '(' or '[' token, skipping the 'case' token.
3189 token = None
3190 for token in self.tokens[self.px + 1 :]:
3191 if token.kind == 'op' and token.value in '([':
3192 break
3193 if is_significant_token(token):
3194 # An implicit tuple: there is no '(' or '[' token.
3195 token = None
3196 break
3197 else:
3198 raise AssignLinksError('Ill-formed tuple') # pragma: no cover
3199 if token:
3200 result.append((self.op, token.value))
3201 for i, pattern in enumerate(patterns):
3202 result.append((self.visit, pattern))
3203 if token:
3204 val = ']' if token.value == '[' else ')'
3205 result.append((self.op, val))
3206 return result
3207 #@+node:ekr.20220402124844.8: *7* iterative.MatchSingleton
3208 # MatchSingleton(constant value)
3210 def do_MatchSingleton(self, node: Node) -> ActionList:
3211 """Match True, False or None."""
3212 return [
3213 (self.token, ('name', repr(node.value))),
3214 ]
3215 #@+node:ekr.20220402124844.9: *7* iterative.MatchStar
3216 # MatchStar(identifier? name)
3218 def do_MatchStar(self, node: Node) -> ActionList:
3220 name = getattr(node, 'name', None)
3221 result: ActionList = [
3222 (self.op, '*'),
3223 ]
3224 if name:
3225 result.append((self.name, name))
3226 return result
3227 #@+node:ekr.20220402124844.10: *7* iterative.MatchValue
3228 # MatchValue(expr value)
3230 def do_MatchValue(self, node: Node) -> ActionList:
3232 return [
3233 (self.visit, node.value),
3234 ]
3235 #@+node:ekr.20220330133336.78: *6* iterative.Nonlocal
3236 # Nonlocal(identifier* names)
3238 def do_Nonlocal(self, node: Node) -> ActionList:
3240 # nonlocal %s\n' % ','.join(node.names))
3241 # No need to put commas.
3242 result: ActionList = [
3243 (self.name, 'nonlocal'),
3244 ]
3245 for z in node.names:
3246 result.append((self.name, z))
3247 return result
3248 #@+node:ekr.20220330133336.79: *6* iterative.Pass
3249 def do_Pass(self, node: Node) -> ActionList:
3251 return ([
3252 (self.name, 'pass'),
3253 ])
3254 #@+node:ekr.20220330133336.80: *6* iterative.Raise
3255 # Raise(expr? exc, expr? cause)
3257 def do_Raise(self, node: Node) -> ActionList:
3259 # No need to put commas.
3260 exc = getattr(node, 'exc', None)
3261 cause = getattr(node, 'cause', None)
3262 tback = getattr(node, 'tback', None)
3263 result: ActionList = [
3264 (self.name, 'raise'),
3265 (self.visit, exc),
3266 ]
3267 if cause:
3268 result.extend([
3269 (self.name, 'from'), # #2446.
3270 (self.visit, cause),
3271 ])
3272 result.append((self.visit, tback))
3273 return result
3275 #@+node:ekr.20220330133336.81: *6* iterative.Return
3276 def do_Return(self, node: Node) -> ActionList:
3278 return [
3279 (self.name, 'return'),
3280 (self.visit, node.value),
3281 ]
3282 #@+node:ekr.20220330133336.82: *6* iterative.Try
3283 # Try(stmt* body, excepthandler* handlers, stmt* orelse, stmt* finalbody)
3285 def do_Try(self, node: Node) -> ActionList:
3287 result: ActionList = [
3288 # Try line...
3289 (self.name, 'try'),
3290 (self.op, ':'),
3291 # Body...
3292 (self.visit, node.body),
3293 (self.visit, node.handlers),
3294 ]
3295 # Else...
3296 if node.orelse:
3297 result.extend([
3298 (self.name, 'else'),
3299 (self.op, ':'),
3300 (self.visit, node.orelse),
3301 ])
3302 # Finally...
3303 if node.finalbody:
3304 result.extend([
3305 (self.name, 'finally'),
3306 (self.op, ':'),
3307 (self.visit, node.finalbody),
3308 ])
3309 return result
3310 #@+node:ekr.20220330133336.83: *6* iterative.While
3311 def do_While(self, node: Node) -> ActionList:
3313 # While line...
3314 # while %s:\n'
3315 result: ActionList = [
3316 (self.name, 'while'),
3317 (self.visit, node.test),
3318 (self.op, ':'),
3319 # Body...
3320 (self.visit, node.body),
3321 ]
3322 # Else clause...
3323 if node.orelse:
3324 result.extend([
3325 (self.name, 'else'),
3326 (self.op, ':'),
3327 (self.visit, node.orelse),
3328 ])
3329 return result
3330 #@+node:ekr.20220330133336.84: *6* iterative.With
3331 # With(withitem* items, stmt* body)
3333 # withitem = (expr context_expr, expr? optional_vars)
3335 def do_With(self, node: Node) -> ActionList:
3337 expr: Optional[ast.AST] = getattr(node, 'context_expression', None)
3338 items: List[ast.AST] = getattr(node, 'items', [])
3339 result: ActionList = [
3340 (self.name, 'with'),
3341 (self.visit, expr),
3342 ]
3343 # No need to put commas.
3344 for item in items:
3345 result.append((self.visit, item.context_expr))
3346 optional_vars = getattr(item, 'optional_vars', None)
3347 if optional_vars is not None:
3348 result.extend([
3349 (self.name, 'as'),
3350 (self.visit, item.optional_vars),
3351 ])
3352 result.extend([
3353 # End the line.
3354 (self.op, ':'),
3355 # Body...
3356 (self.visit, node.body),
3357 ])
3358 return result
3359 #@+node:ekr.20220330133336.85: *6* iterative.Yield
3360 def do_Yield(self, node: Node) -> ActionList:
3362 result: ActionList = [
3363 (self.name, 'yield'),
3364 ]
3365 if hasattr(node, 'value'):
3366 result.extend([
3367 (self.visit, node.value),
3368 ])
3369 return result
3370 #@+node:ekr.20220330133336.86: *6* iterative.YieldFrom
3371 # YieldFrom(expr value)
3373 def do_YieldFrom(self, node: Node) -> ActionList:
3375 return ([
3376 (self.name, 'yield'),
3377 (self.name, 'from'),
3378 (self.visit, node.value),
3379 ])
3380 #@-others
3381#@+node:ekr.20200107165250.1: *3* class Orange
3382class Orange:
3383 """
3384 A flexible and powerful beautifier for Python.
3385 Orange is the new black.
3387 *Important*: This is predominantly a *token*-based beautifier.
3388 However, orange.colon and orange.possible_unary_op use the parse
3389 tree to provide context that would otherwise be difficult to
3390 deduce.
3391 """
3392 # This switch is really a comment. It will always be false.
3393 # It marks the code that simulates the operation of the black tool.
3394 black_mode = False
3396 # Patterns...
3397 nobeautify_pat = re.compile(r'\s*#\s*pragma:\s*no\s*beautify\b|#\s*@@nobeautify')
3399 # Patterns from FastAtRead class, specialized for python delims.
3400 node_pat = re.compile(r'^(\s*)#@\+node:([^:]+): \*(\d+)?(\*?) (.*)$') # @node
3401 start_doc_pat = re.compile(r'^\s*#@\+(at|doc)?(\s.*?)?$') # @doc or @
3402 at_others_pat = re.compile(r'^(\s*)#@(\+|-)others\b(.*)$') # @others
3404 # Doc parts end with @c or a node sentinel. Specialized for python.
3405 end_doc_pat = re.compile(r"^\s*#@(@(c(ode)?)|([+]node\b.*))$")
3406 #@+others
3407 #@+node:ekr.20200107165250.2: *4* orange.ctor
3408 def __init__(self, settings: Optional[Dict[str, Any]]=None):
3409 """Ctor for Orange class."""
3410 if settings is None:
3411 settings = {}
3412 valid_keys = (
3413 'allow_joined_strings',
3414 'max_join_line_length',
3415 'max_split_line_length',
3416 'orange',
3417 'tab_width',
3418 )
3419 # For mypy...
3420 self.kind: str = ''
3421 # Default settings...
3422 self.allow_joined_strings = False # EKR's preference.
3423 self.max_join_line_length = 88
3424 self.max_split_line_length = 88
3425 self.tab_width = 4
3426 # Override from settings dict...
3427 for key in settings: # pragma: no cover
3428 value = settings.get(key)
3429 if key in valid_keys and value is not None:
3430 setattr(self, key, value)
3431 else:
3432 g.trace(f"Unexpected setting: {key} = {value!r}")
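# A minimal usage sketch; 'example.py' is a hypothetical path, and unexpected
# settings keys are merely reported via g.trace:
#
#     orange = Orange(settings={'tab_width': 4, 'max_split_line_length': 88})
#     changed = orange.beautify_file('example.py')  # True iff the file changed.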
3433 #@+node:ekr.20200107165250.51: *4* orange.push_state
3434 def push_state(self, kind: str, value: Any=None) -> None:
3435 """Append a state to the state stack."""
3436 state = ParseState(kind, value)
3437 self.state_stack.append(state)
3438 #@+node:ekr.20200107165250.8: *4* orange: Entries
3439 #@+node:ekr.20200107173542.1: *5* orange.beautify (main token loop)
3440 def oops(self) -> None: # pragma: no cover
3441 g.trace(f"Unknown kind: {self.kind}")
3443 def beautify(self, contents: str, filename: str, tokens: List["Token"], tree: Node,
3445 max_join_line_length: Optional[int]=None, max_split_line_length: Optional[int]=None,
3446 ) -> str:
3447 """
3448 The main line. Create output tokens and return the result as a string.
3449 """
3450 # Config overrides
3451 if max_join_line_length is not None:
3452 self.max_join_line_length = max_join_line_length
3453 if max_split_line_length is not None:
3454 self.max_split_line_length = max_split_line_length
3455 # State vars...
3456 self.curly_brackets_level = 0 # Number of unmatched '{' tokens.
3457 self.decorator_seen = False # Set by do_name for do_op.
3458 self.in_arg_list = 0 # > 0 if in an arg list of a def.
3459 self.level = 0 # Set only by do_indent and do_dedent.
3460 self.lws = '' # Leading whitespace.
3461 self.paren_level = 0 # Number of unmatched '(' tokens.
3462 self.square_brackets_stack: List[bool] = [] # A stack of bools, for self.word().
3463 self.state_stack: List["ParseState"] = [] # Stack of ParseState objects.
3464 self.val = None # The input token's value (a string).
3465 self.verbatim = False # True: don't beautify.
3466 #
3467 # Init output list and state...
3468 self.code_list: List[Token] = [] # The list of output tokens.
3469 self.code_list_index = 0 # The token's index.
3470 self.tokens = tokens # The list of input tokens.
3471 self.tree = tree
3472 self.add_token('file-start', '')
3473 self.push_state('file-start')
3474 for i, token in enumerate(tokens):
3475 self.token = token
3476 self.kind, self.val, self.line = token.kind, token.value, token.line
3477 if self.verbatim:
3478 self.do_verbatim()
3479 else:
3480 func = getattr(self, f"do_{token.kind}", self.oops)
3481 func()
3482 # Any post pass would go here.
3483 return tokens_to_string(self.code_list)
3484 #@+node:ekr.20200107172450.1: *5* orange.beautify_file (entry)
3485 def beautify_file(self, filename: str) -> bool: # pragma: no cover
3486 """
3487 Orange: Beautify the given external file.
3489 Return True if the file was changed.
3490 """
3491 self.filename = filename
3492 tog = TokenOrderGenerator()
3493 contents, encoding, tokens, tree = tog.init_from_file(filename)
3494 if not contents or not tokens or not tree:
3495 return False # #2529: Not an error.
3496 # Beautify.
3497 try:
3498 results = self.beautify(contents, filename, tokens, tree)
3499 except BeautifyError:
3500 return False # #2578.
3501 # Something besides newlines must change.
3502 if regularize_nls(contents) == regularize_nls(results):
3503 return False
3504 if 0: # This obscures more important error messages.
3505 show_diffs(contents, results, filename=filename)
3506 # Write the results
3507 print(f"Beautified: {g.shortFileName(filename)}")
3508 write_file(filename, results, encoding=encoding)
3509 return True
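# A hedged usage sketch: beautify a single file from a script. beautify_file
# returns True only when the file actually changed on disk. The path below is
# hypothetical.
#
#     changed = Orange().beautify_file('some_module.py')
#     print('beautified' if changed else 'unchanged')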
3510 #@+node:ekr.20200107172512.1: *5* orange.beautify_file_diff (entry)
3511 def beautify_file_diff(self, filename: str) -> bool: # pragma: no cover
3512 """
3513 Orange: Print the diffs that would result from the orange-file command.
3515 Return True if the file would be changed.
3516 """
3517 tag = 'diff-beautify-file'
3518 self.filename = filename
3519 tog = TokenOrderGenerator()
3520 contents, encoding, tokens, tree = tog.init_from_file(filename)
3521 if not contents or not tokens or not tree:
3522 print(f"{tag}: Can not beautify: {filename}")
3523 return False
3524 # fstringify.
3525 results = self.beautify(contents, filename, tokens, tree)
3526 # Something besides newlines must change.
3527 if regularize_nls(contents) == regularize_nls(results):
3528 print(f"{tag}: Unchanged: {filename}")
3529 return False
3530 # Show the diffs.
3531 show_diffs(contents, results, filename=filename)
3532 return True
3533 #@+node:ekr.20200107165250.13: *4* orange: Input token handlers
3534 #@+node:ekr.20200107165250.14: *5* orange.do_comment
3535 in_doc_part = False
3537 def do_comment(self) -> None:
3538 """Handle a comment token."""
3539 val = self.val
3540 #
3541 # Leo-specific code...
3542 if self.node_pat.match(val):
3543 # Clear per-node state.
3544 self.in_doc_part = False
3545 self.verbatim = False
3546 self.decorator_seen = False
3547 # Do *not* clear other state, which may persist across @others.
3548 # self.curly_brackets_level = 0
3549 # self.in_arg_list = 0
3550 # self.level = 0
3551 # self.lws = ''
3552 # self.paren_level = 0
3553 # self.square_brackets_stack = []
3554 # self.state_stack = []
3555 else:
3556 # Keep track of verbatim mode.
3557 if self.beautify_pat.match(val):
3558 self.verbatim = False
3559 elif self.nobeautify_pat.match(val):
3560 self.verbatim = True
3561 # Keep track of @doc parts, to honor the convention for splitting lines.
3562 if self.start_doc_pat.match(val):
3563 self.in_doc_part = True
3564 if self.end_doc_pat.match(val):
3565 self.in_doc_part = False
3566 #
3567 # General code: Generate the comment.
3568 self.clean('blank')
3569 entire_line = self.line.lstrip().startswith('#')
3570 if entire_line:
3571 self.clean('hard-blank')
3572 self.clean('line-indent')
3573 # #1496: No further munging needed.
3574 val = self.line.rstrip()
3575 else:
3576 # Exactly two spaces before trailing comments.
3577 val = ' ' + self.val.rstrip()
3578 self.add_token('comment', val)
3579 #@+node:ekr.20200107165250.15: *5* orange.do_encoding
3580 def do_encoding(self) -> None:
3581 """
3582 Handle the encoding token.
3583 """
3584 pass
3585 #@+node:ekr.20200107165250.16: *5* orange.do_endmarker
3586 def do_endmarker(self) -> None:
3587 """Handle an endmarker token."""
3588 # Ensure exactly one blank at the end of the file.
3589 self.clean_blank_lines()
3590 self.add_token('line-end', '\n')
3591 #@+node:ekr.20200107165250.18: *5* orange.do_indent & do_dedent & helper
3592 # Note: other methods use self.level.
3594 def do_dedent(self) -> None:
3595 """Handle dedent token."""
3596 self.level -= 1
3597 self.lws = self.level * self.tab_width * ' '
3598 self.line_indent()
3599 if self.black_mode: # pragma: no cover (black)
3600 state = self.state_stack[-1]
3601 if state.kind == 'indent' and state.value == self.level:
3602 self.state_stack.pop()
3603 state = self.state_stack[-1]
3604 if state.kind in ('class', 'def'):
3605 self.state_stack.pop()
3606 self.handle_dedent_after_class_or_def(state.kind)
3608 def do_indent(self) -> None:
3609 """Handle indent token."""
3610 # #2578: Refuse to beautify files containing leading tabs or unusual indentation.
3611 consider_message = 'consider using python/Tools/scripts/reindent.py'
3612 if '\t' in self.val:
3613 message = f"Leading tabs found: {self.filename}"
3614 print(message)
3615 print(consider_message)
3616 raise BeautifyError(message)
3617 if (len(self.val) % self.tab_width) != 0:
3618 message = f" Indentation error: {self.filename}"
3619 print(message)
3620 print(consider_message)
3621 raise BeautifyError(message)
3622 new_indent = self.val
3623 old_indent = self.level * self.tab_width * ' '
3624 if new_indent > old_indent:
3625 self.level += 1
3626 elif new_indent < old_indent: # pragma: no cover (defensive)
3627 g.trace('\n===== can not happen', repr(new_indent), repr(old_indent))
3628 self.lws = new_indent
3629 self.line_indent()
3630 #@+node:ekr.20200220054928.1: *6* orange.handle_dedent_after_class_or_def
3631 def handle_dedent_after_class_or_def(self, kind: str) -> None: # pragma: no cover (black)
3632 """
3633 Insert blank lines after a class or def as the result of a 'dedent' token.
3635 Normal comment lines may precede the 'dedent'.
3636 Insert the blank lines *before* such comment lines.
3637 """
3638 #
3639 # Compute the tail.
3640 i = len(self.code_list) - 1
3641 tail: List[Token] = []
3642 while i > 0:
3643 t = self.code_list.pop()
3644 i -= 1
3645 if t.kind == 'line-indent':
3646 pass
3647 elif t.kind == 'line-end':
3648 tail.insert(0, t)
3649 elif t.kind == 'comment':
3650 # Only underindented single-line comments belong in the tail.
3651 # @+node comments must never be in the tail.
3652 single_line = self.code_list[i].kind in ('line-end', 'line-indent')
3653 lws = len(t.value) - len(t.value.lstrip())
3654 underindent = lws <= len(self.lws)
3655 if underindent and single_line and not self.node_pat.match(t.value):
3656 # A single-line comment.
3657 tail.insert(0, t)
3658 else:
3659 self.code_list.append(t)
3660 break
3661 else:
3662 self.code_list.append(t)
3663 break
3664 #
3665 # Remove leading 'line-end' tokens from the tail.
3666 while tail and tail[0].kind == 'line-end':
3667 tail = tail[1:]
3668 #
3669 # Put the newlines *before* the tail.
3670 # For Leo, always use 1 blank line.
3671 n = 1 # n = 2 if kind == 'class' else 1
3672 # Retain the token (intention) for debugging.
3673 self.add_token('blank-lines', n)
3674 for i in range(0, n + 1):
3675 self.add_token('line-end', '\n')
3676 if tail:
3677 self.code_list.extend(tail)
3678 self.line_indent()
3679 #@+node:ekr.20200107165250.20: *5* orange.do_name
3680 def do_name(self) -> None:
3681 """Handle a name token."""
3682 name = self.val
3683 if self.black_mode and name in ('class', 'def'): # pragma: no cover (black)
3684 # Handle newlines before and after 'class' or 'def'
3685 self.decorator_seen = False
3686 state = self.state_stack[-1]
3687 if state.kind == 'decorator':
3688 # Always do this, regardless of @bool clean-blank-lines.
3689 self.clean_blank_lines()
3690 # Suppress split/join.
3691 self.add_token('hard-newline', '\n')
3692 self.add_token('line-indent', self.lws)
3693 self.state_stack.pop()
3694 else:
3695 # Always do this, regardless of @bool clean-blank-lines.
3696 self.blank_lines(2 if name == 'class' else 1)
3697 self.push_state(name)
3698 # For trailing lines after inner classes/defs.
3699 self.push_state('indent', self.level)
3700 self.word(name)
3701 return
3702 #
3703 # Leo mode...
3704 if name in ('class', 'def'):
3705 self.word(name)
3706 elif name in (
3707 'and', 'elif', 'else', 'for', 'if', 'in', 'not', 'not in', 'or', 'while'
3708 ):
3709 self.word_op(name)
3710 else:
3711 self.word(name)
3712 #@+node:ekr.20200107165250.21: *5* orange.do_newline & do_nl
3713 def do_newline(self) -> None:
3714 """Handle a regular newline."""
3715 self.line_end()
3717 def do_nl(self) -> None:
3718 """Handle a continuation line."""
3719 self.line_end()
3720 #@+node:ekr.20200107165250.22: *5* orange.do_number
3721 def do_number(self) -> None:
3722 """Handle a number token."""
3723 self.blank()
3724 self.add_token('number', self.val)
3725 #@+node:ekr.20200107165250.23: *5* orange.do_op
3726 def do_op(self) -> None:
3727 """Handle an op token."""
3728 val = self.val
3729 if val == '.':
3730 self.clean('blank')
3731 prev = self.code_list[-1]
3732 # #2495 & #2533: Special case for 'from .'
3733 if prev.kind == 'word' and prev.value == 'from':
3734 self.blank()
3735 self.add_token('op-no-blanks', val)
3736 elif val == '@':
3737 if self.black_mode: # pragma: no cover (black)
3738 if not self.decorator_seen:
3739 self.blank_lines(1)
3740 self.decorator_seen = True
3741 self.clean('blank')
3742 self.add_token('op-no-blanks', val)
3743 self.push_state('decorator')
3744 elif val == ':':
3745 # Treat slices differently.
3746 self.colon(val)
3747 elif val in ',;':
3748 # Pep 8: Avoid extraneous whitespace immediately before
3749 # comma, semicolon, or colon.
3750 self.clean('blank')
3751 self.add_token('op', val)
3752 self.blank()
3753 elif val in '([{':
3754 # Pep 8: Avoid extraneous whitespace immediately inside
3755 # parentheses, brackets or braces.
3756 self.lt(val)
3757 elif val in ')]}':
3758 # Ditto.
3759 self.rt(val)
3760 elif val == '=':
3761 # Pep 8: Don't use spaces around the = sign when used to indicate
3762 # a keyword argument or a default parameter value.
3763 if self.paren_level:
3764 self.clean('blank')
3765 self.add_token('op-no-blanks', val)
3766 else:
3767 self.blank()
3768 self.add_token('op', val)
3769 self.blank()
3770 elif val in '~+-':
3771 self.possible_unary_op(val)
3772 elif val == '*':
3773 self.star_op()
3774 elif val == '**':
3775 self.star_star_op()
3776 else:
3777 # Pep 8: always surround binary operators with a single space.
3778 # '==','+=','-=','*=','**=','/=','//=','%=','!=','<=','>=','<','>',
3779 # '^','~','*','**','&','|','/','//',
3780 # Pep 8: If operators with different priorities are used,
3781 # consider adding whitespace around the operators with the lowest priority(ies).
3782 self.blank()
3783 self.add_token('op', val)
3784 self.blank()
3785 #@+node:ekr.20200107165250.24: *5* orange.do_string
3786 def do_string(self) -> None:
3787 """Handle a 'string' token."""
3788 # Careful: continued strings may contain '\r'
3789 val = regularize_nls(self.val)
3790 self.add_token('string', val)
3791 self.blank()
3792 #@+node:ekr.20200210175117.1: *5* orange.do_verbatim
3793 beautify_pat = re.compile(
3794 r'#\s*pragma:\s*beautify\b|#\s*@@beautify|#\s*@\+node|#\s*@[+-]others|#\s*@[+-]<<')
3796 def do_verbatim(self) -> None:
3797 """
3798 Handle one token in verbatim mode.
3799 End verbatim mode when the appropriate comment is seen.
3800 """
3801 kind = self.kind
3802 #
3803 # Careful: tokens may contain '\r'
3804 val = regularize_nls(self.val)
3805 if kind == 'comment':
3806 if self.beautify_pat.match(val):
3807 self.verbatim = False
3808 val = val.rstrip()
3809 self.add_token('comment', val)
3810 return
3811 if kind == 'indent':
3812 self.level += 1
3813 self.lws = self.level * self.tab_width * ' '
3814 if kind == 'dedent':
3815 self.level -= 1
3816 self.lws = self.level * self.tab_width * ' '
3817 self.add_token('verbatim', val)
3818 #@+node:ekr.20200107165250.25: *5* orange.do_ws
3819 def do_ws(self) -> None:
3820 """
3821 Handle the "ws" pseudo-token.
3823 Put the whitespace only if it ends with backslash-newline.
3824 """
3825 val = self.val
3826 # Handle backslash-newline.
3827 if '\\\n' in val:
3828 self.clean('blank')
3829 self.add_token('op-no-blanks', val)
3830 return
3831 # Handle start-of-line whitespace.
3832 prev = self.code_list[-1]
3833 inner = self.paren_level or self.square_brackets_stack or self.curly_brackets_level
3834 if prev.kind == 'line-indent' and inner:
3835 # Retain the indent that won't be cleaned away.
3836 self.clean('line-indent')
3837 self.add_token('hard-blank', val)
3838 #@+node:ekr.20200107165250.26: *4* orange: Output token generators
3839 #@+node:ekr.20200118145044.1: *5* orange.add_line_end
3840 def add_line_end(self) -> "Token":
3841 """Add a line-end request to the code list."""
3842 # This may be called from do_name as well as do_newline and do_nl.
3843 assert self.token.kind in ('newline', 'nl'), self.token.kind
3844 self.clean('blank') # Important!
3845 self.clean('line-indent')
3846 t = self.add_token('line-end', '\n')
3847 # Distinguish between kinds of 'line-end' tokens.
3848 t.newline_kind = self.token.kind
3849 return t
3850 #@+node:ekr.20200107170523.1: *5* orange.add_token
3851 def add_token(self, kind: str, value: Any) -> "Token":
3852 """Add an output token to the code list."""
3853 tok = Token(kind, value)
3854 tok.index = self.code_list_index # For debugging only.
3855 self.code_list_index += 1
3856 self.code_list.append(tok)
3857 return tok
3858 #@+node:ekr.20200107165250.27: *5* orange.blank
3859 def blank(self) -> None:
3860 """Add a blank request to the code list."""
3861 prev = self.code_list[-1]
3862 if prev.kind not in (
3863 'blank',
3864 'blank-lines',
3865 'file-start',
3866 'hard-blank', # Unique to orange.
3867 'line-end',
3868 'line-indent',
3869 'lt',
3870 'op-no-blanks',
3871 'unary-op',
3872 ):
3873 self.add_token('blank', ' ')
3874 #@+node:ekr.20200107165250.29: *5* orange.blank_lines (black only)
3875 def blank_lines(self, n: int) -> None: # pragma: no cover (black)
3876 """
3877 Add a request for n blank lines to the code list.
3878 Multiple blank-lines requests yield at least the maximum of all requests.
3879 """
3880 self.clean_blank_lines()
3881 prev = self.code_list[-1]
3882 if prev.kind == 'file-start':
3883 self.add_token('blank-lines', n)
3884 return
3885 for i in range(0, n + 1):
3886 self.add_token('line-end', '\n')
3887 # Retain the token (intention) for debugging.
3888 self.add_token('blank-lines', n)
3889 self.line_indent()
3890 #@+node:ekr.20200107165250.30: *5* orange.clean
3891 def clean(self, kind: str) -> None:
3892 """Remove the last item of token list if it has the given kind."""
3893 prev = self.code_list[-1]
3894 if prev.kind == kind:
3895 self.code_list.pop()
3896 #@+node:ekr.20200107165250.31: *5* orange.clean_blank_lines
3897 def clean_blank_lines(self) -> bool:
3898 """
3899 Remove all vestiges of previous blank lines.
3901 Return True if any of the cleaned 'line-end' tokens represented "hard" newlines.
3902 """
3903 cleaned_newline = False
3904 table = ('blank-lines', 'line-end', 'line-indent')
3905 while self.code_list[-1].kind in table:
3906 t = self.code_list.pop()
3907 if t.kind == 'line-end' and getattr(t, 'newline_kind', None) != 'nl':
3908 cleaned_newline = True
3909 return cleaned_newline
3910 #@+node:ekr.20200107165250.32: *5* orange.colon
3911 def colon(self, val: str) -> None:
3912 """Handle a colon."""
3914 def is_expr(node: Node) -> bool:
3915 """True if node is any expression other than += number."""
3916 if isinstance(node, (ast.BinOp, ast.Call, ast.IfExp)):
3917 return True
3918 return (
3919 isinstance(node, ast.UnaryOp)
3920 and not isinstance(node.operand, ast.Num)
3921 )
3923 node = self.token.node
3924 self.clean('blank')
3925 if not isinstance(node, ast.Slice):
3926 self.add_token('op', val)
3927 self.blank()
3928 return
3929 # A slice.
3930 lower = getattr(node, 'lower', None)
3931 upper = getattr(node, 'upper', None)
3932 step = getattr(node, 'step', None)
3933 if any(is_expr(z) for z in (lower, upper, step)):
3934 prev = self.code_list[-1]
3935 if prev.value not in '[:':
3936 self.blank()
3937 self.add_token('op', val)
3938 self.blank()
3939 else:
3940 self.add_token('op-no-blanks', val)
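# Illustration of the slice rule above (assumed behavior, not a test case):
# simple bounds get no blanks around ':', computed bounds do.
#
#     s[1:2]           # plain operands: 'op-no-blanks'.
#     s[i + 1 : f(j)]  # BinOp/Call bound: blanks around ':'.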
3941 #@+node:ekr.20200107165250.33: *5* orange.line_end
3942 def line_end(self) -> None:
3943 """Add a line-end request to the code list."""
3944 # This should be called only by do_newline and do_nl.
3945 node, token = self.token.statement_node, self.token
3946 assert token.kind in ('newline', 'nl'), (token.kind, g.callers())
3947 # Create the 'line-end' output token.
3948 self.add_line_end()
3949 # Attempt to split the line.
3950 was_split = self.split_line(node, token)
3951 # Attempt to join the line only if it has not just been split.
3952 if not was_split and self.max_join_line_length > 0:
3953 self.join_lines(node, token)
3954 # Add the indentation for all lines
3955 # until the next indent or unindent token.
3956 self.line_indent()
3957 #@+node:ekr.20200107165250.40: *5* orange.line_indent
3958 def line_indent(self) -> None:
3959 """Add a line-indent token."""
3960 self.clean('line-indent') # Defensive. Should never happen.
3961 self.add_token('line-indent', self.lws)
3962 #@+node:ekr.20200107165250.41: *5* orange.lt & rt
3963 #@+node:ekr.20200107165250.42: *6* orange.lt
3964 def lt(self, val: str) -> None:
3965 """Generate code for a left paren or curly/square bracket."""
3966 assert val in '([{', repr(val)
3967 if val == '(':
3968 self.paren_level += 1
3969 elif val == '[':
3970 self.square_brackets_stack.append(False)
3971 else:
3972 self.curly_brackets_level += 1
3973 self.clean('blank')
3974 prev = self.code_list[-1]
3975 if prev.kind in ('op', 'word-op'):
3976 self.blank()
3977 self.add_token('lt', val)
3978 elif prev.kind == 'word':
3979 # Only suppress blanks before '(' or '[' for non-keywords.
3980 if val == '{' or prev.value in ('if', 'else', 'return', 'for'):
3981 self.blank()
3982 elif val == '(':
3983 self.in_arg_list += 1
3984 self.add_token('lt', val)
3985 else:
3986 self.clean('blank')
3987 self.add_token('op-no-blanks', val)
3988 #@+node:ekr.20200107165250.43: *6* orange.rt
3989 def rt(self, val: str) -> None:
3990 """Generate code for a right paren or curly/square bracket."""
3991 assert val in ')]}', repr(val)
3992 if val == ')':
3993 self.paren_level -= 1
3994 self.in_arg_list = max(0, self.in_arg_list - 1)
3995 elif val == ']':
3996 self.square_brackets_stack.pop()
3997 else:
3998 self.curly_brackets_level -= 1
3999 self.clean('blank')
4000 self.add_token('rt', val)
4001 #@+node:ekr.20200107165250.45: *5* orange.possible_unary_op & unary_op
4002 def possible_unary_op(self, s: str) -> None:
4003 """Add a unary or binary op to the token list."""
4004 node = self.token.node
4005 self.clean('blank')
4006 if isinstance(node, ast.UnaryOp):
4007 self.unary_op(s)
4008 else:
4009 self.blank()
4010 self.add_token('op', s)
4011 self.blank()
4013 def unary_op(self, s: str) -> None:
4014 """Add an operator request to the code list."""
4015 assert s and isinstance(s, str), repr(s)
4016 self.clean('blank')
4017 prev = self.code_list[-1]
4018 if prev.kind == 'lt':
4019 self.add_token('unary-op', s)
4020 else:
4021 self.blank()
4022 self.add_token('unary-op', s)
4023 #@+node:ekr.20200107165250.46: *5* orange.star_op
4024 def star_op(self) -> None:
4025 """Put a '*' op, with special cases for *args."""
4026 val = '*'
4027 node = self.token.node
4028 self.clean('blank')
4029 if isinstance(node, ast.arguments):
4030 self.blank()
4031 self.add_token('op', val)
4032 return # #2533
4033 if self.paren_level > 0:
4034 prev = self.code_list[-1]
4035 if prev.kind == 'lt' or (prev.kind, prev.value) == ('op', ','):
4036 self.blank()
4037 self.add_token('op', val)
4038 return
4039 self.blank()
4040 self.add_token('op', val)
4041 self.blank()
4042 #@+node:ekr.20200107165250.47: *5* orange.star_star_op
4043 def star_star_op(self) -> None:
4044 """Put a ** operator, with a special case for **kwargs."""
4045 val = '**'
4046 node = self.token.node
4047 self.clean('blank')
4048 if isinstance(node, ast.arguments):
4049 self.blank()
4050 self.add_token('op', val)
4051 return # #2533
4052 if self.paren_level > 0:
4053 prev = self.code_list[-1]
4054 if prev.kind == 'lt' or (prev.kind, prev.value) == ('op', ','):
4055 self.blank()
4056 self.add_token('op', val)
4057 return
4058 self.blank()
4059 self.add_token('op', val)
4060 self.blank()
4061 #@+node:ekr.20200107165250.48: *5* orange.word & word_op
4062 def word(self, s: str) -> None:
4063 """Add a word request to the code list."""
4064 assert s and isinstance(s, str), repr(s)
4065 node = self.token.node
4066 if isinstance(node, ast.ImportFrom) and s == 'import': # #2533
4067 self.clean('blank')
4068 self.add_token('blank', ' ')
4069 self.add_token('word', s)
4070 elif self.square_brackets_stack:
4071 # A previous 'op-no-blanks' token may cancel this blank.
4072 self.blank()
4073 self.add_token('word', s)
4074 elif self.in_arg_list > 0:
4075 self.add_token('word', s)
4076 self.blank()
4077 else:
4078 self.blank()
4079 self.add_token('word', s)
4080 self.blank()
4082 def word_op(self, s: str) -> None:
4083 """Add a word-op request to the code list."""
4084 assert s and isinstance(s, str), repr(s)
4085 self.blank()
4086 self.add_token('word-op', s)
4087 self.blank()
4088 #@+node:ekr.20200118120049.1: *4* orange: Split/join
4089 #@+node:ekr.20200107165250.34: *5* orange.split_line & helpers
4090 def split_line(self, node: Node, token: "Token") -> bool:
4091 """
4092 Split token's line, if possible and enabled.
4094 Return True if the line was broken into two or more lines.
4095 """
4096 assert token.kind in ('newline', 'nl'), repr(token)
4097 # Return if splitting is disabled:
4098 if self.max_split_line_length <= 0: # pragma: no cover (user option)
4099 return False
4100 # Return if the node can't be split.
4101 if not is_long_statement(node):
4102 return False
4103 # Find the *output* tokens of the previous lines.
4104 line_tokens = self.find_prev_line()
4105 line_s = ''.join([z.to_string() for z in line_tokens])
4106 # Do nothing for short lines.
4107 if len(line_s) < self.max_split_line_length:
4108 return False
4109 # Return if the previous line has no opening delim: (, [ or {.
4110 if not any(z.kind == 'lt' for z in line_tokens): # pragma: no cover (defensive)
4111 return False
4112 prefix = self.find_line_prefix(line_tokens)
4113 # Calculate the tail before cleaning the prefix.
4114 tail = line_tokens[len(prefix) :]
4115 # Cut back the token list: subtract 1 for the trailing line-end.
4116 self.code_list = self.code_list[: len(self.code_list) - len(line_tokens) - 1]
4117 # Append the tail, splitting it further, as needed.
4118 self.append_tail(prefix, tail)
4119 # Add the line-end token deleted by find_line_prefix.
4120 self.add_token('line-end', '\n')
4121 return True
4122 #@+node:ekr.20200107165250.35: *6* orange.append_tail
4123 def append_tail(self, prefix: List["Token"], tail: List["Token"]) -> None:
4124 """Append the tail tokens, splitting the line further as necessary."""
4125 tail_s = ''.join([z.to_string() for z in tail])
4126 if len(tail_s) < self.max_split_line_length:
4127 # Add the prefix.
4128 self.code_list.extend(prefix)
4129 # Start a new line and increase the indentation.
4130 self.add_token('line-end', '\n')
4131 self.add_token('line-indent', self.lws + ' ' * 4)
4132 self.code_list.extend(tail)
4133 return
4134 # Still too long. Split the line at commas.
4135 self.code_list.extend(prefix)
4136 # Start a new line and increase the indentation.
4137 self.add_token('line-end', '\n')
4138 self.add_token('line-indent', self.lws + ' ' * 4)
4139 open_delim = Token(kind='lt', value=prefix[-1].value)
4140 value = open_delim.value.replace('(', ')').replace('[', ']').replace('{', '}')
4141 close_delim = Token(kind='rt', value=value)
4142 delim_count = 1
4143 lws = self.lws + ' ' * 4
4144 for i, t in enumerate(tail):
4145 if t.kind == 'op' and t.value == ',':
4146 if delim_count == 1:
4147 # Start a new line.
4148 self.add_token('op-no-blanks', ',')
4149 self.add_token('line-end', '\n')
4150 self.add_token('line-indent', lws)
4151 # Kill a following blank.
4152 if i + 1 < len(tail):
4153 next_t = tail[i + 1]
4154 if next_t.kind == 'blank':
4155 next_t.kind = 'no-op'
4156 next_t.value = ''
4157 else:
4158 self.code_list.append(t)
4159 elif t.kind == close_delim.kind and t.value == close_delim.value:
4160 # Done if the delims match.
4161 delim_count -= 1
4162 if delim_count == 0:
4163 # Start a new line
4164 self.add_token('op-no-blanks', ',')
4165 self.add_token('line-end', '\n')
4166 self.add_token('line-indent', self.lws)
4167 self.code_list.extend(tail[i:])
4168 return
4169 lws = lws[:-4]
4170 self.code_list.append(t)
4171 elif t.kind == open_delim.kind and t.value == open_delim.value:
4172 delim_count += 1
4173 lws = lws + ' ' * 4
4174 self.code_list.append(t)
4175 else:
4176 self.code_list.append(t)
4177 g.trace('BAD DELIMS', delim_count) # pragma: no cover
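# A rough before/after sketch of the splitting strategy above (illustrative
# names; exact output depends on the surrounding tokens and settings):
#
#     result = some_function(first_argument, second_argument, third_argument)
#
# becomes, once the line exceeds max_split_line_length:
#
#     result = some_function(
#         first_argument,
#         second_argument,
#         third_argument,
#     )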
4178 #@+node:ekr.20200107165250.36: *6* orange.find_prev_line
4179 def find_prev_line(self) -> List["Token"]:
4180 """Return the previous line, as a list of tokens."""
4181 line = []
4182 for t in reversed(self.code_list[:-1]):
4183 if t.kind in ('hard-newline', 'line-end'):
4184 break
4185 line.append(t)
4186 return list(reversed(line))
4187 #@+node:ekr.20200107165250.37: *6* orange.find_line_prefix
4188 def find_line_prefix(self, token_list: List["Token"]) -> List["Token"]:
4189 """
4190 Return all tokens up to and including the first lt token.
4191 Also add all lt tokens directly following the first lt token.
4192 """
4193 result = []
4194 for i, t in enumerate(token_list):
4195 result.append(t)
4196 if t.kind == 'lt':
4197 break
4198 return result
4199 #@+node:ekr.20200107165250.39: *5* orange.join_lines
4200 def join_lines(self, node: Node, token: "Token") -> None:
4201 """
4202 Join preceding lines, if possible and enabled.
4203 token is a line_end token. node is the corresponding ast node.
4204 """
4205 if self.max_join_line_length <= 0: # pragma: no cover (user option)
4206 return
4207 assert token.kind in ('newline', 'nl'), repr(token)
4208 if token.kind == 'nl':
4209 return
4210 # Scan backward in the *code* list,
4211 # looking for 'line-end' tokens with tok.newline_kind == 'nl'
4212 nls = 0
4213 i = len(self.code_list) - 1
4214 t = self.code_list[i]
4215 assert t.kind == 'line-end', repr(t)
4216 # Not all tokens have a newline_kind ivar.
4217 assert t.newline_kind == 'newline'
4218 i -= 1
4219 while i >= 0:
4220 t = self.code_list[i]
4221 if t.kind == 'comment':
4222 # Can't join.
4223 return
4224 if t.kind == 'string' and not self.allow_joined_strings:
4225 # An EKR preference: don't join strings, no matter what black does.
4226 # This allows "short" f-strings to be aligned.
4227 return
4228 if t.kind == 'line-end':
4229 if getattr(t, 'newline_kind', None) == 'nl':
4230 nls += 1
4231 else:
4232 break # pragma: no cover
4233 i -= 1
4234 # Retain at least the file-start token.
4235 if i <= 0:
4236 i = 1
4237 if nls <= 0: # pragma: no cover (rare)
4238 return
4239 # Retain line-end and any following line-indent.
4240 # Required, so that the regex below won't eat too much.
4241 while True:
4242 t = self.code_list[i]
4243 if t.kind == 'line-end':
4244 if getattr(t, 'newline_kind', None) == 'nl': # pragma: no cover (rare)
4245 nls -= 1
4246 i += 1
4247 elif self.code_list[i].kind == 'line-indent':
4248 i += 1
4249 else:
4250 break # pragma: no cover (defensive)
4251 if nls <= 0: # pragma: no cover (defensive)
4252 return
4253 # Calculate the joined line.
4254 tail = self.code_list[i:]
4255 tail_s = tokens_to_string(tail)
4256 tail_s = re.sub(r'\n\s*', ' ', tail_s)
4257 tail_s = tail_s.replace('( ', '(').replace(' )', ')')
4258 tail_s = tail_s.rstrip()
4259 # Don't join the lines if they would be too long.
4260 if len(tail_s) > self.max_join_line_length: # pragma: no cover (defensive)
4261 return
4262 # Cut back the code list.
4263 self.code_list = self.code_list[:i]
4264 # Add the new output tokens.
4265 self.add_token('string', tail_s)
4266 self.add_token('line-end', '\n')
4267 #@-others
4268#@+node:ekr.20200107170126.1: *3* class ParseState
4269class ParseState:
4270 """
4271 A class representing items in the parse state stack.
4273 The present states:
4275 'file-start': Ensures the state stack is never empty.
4277 'decorator': The last '@' was a decorator.
4279 do_op(): push_state('decorator')
4280 do_name(): pops the stack if state.kind == 'decorator'.
4282 'indent': The indentation level for 'class' and 'def' names.
4284 do_name(): push_state('indent', self.level)
4285 do_dedent(): pops the stack once or twice if state.value == self.level.
4287 """
4289 def __init__(self, kind: str, value: str) -> None:
4290 self.kind = kind
4291 self.value = value
4293 def __repr__(self) -> str:
4294 return f"State: {self.kind} {self.value!r}" # pragma: no cover
4296 __str__ = __repr__
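# Tiny illustration of the repr (hypothetical values):
#
#     >>> ParseState('indent', 1)
#     State: indent 1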
4297#@+node:ekr.20191231084514.1: *3* class ReassignTokens
4298class ReassignTokens:
4299 """A class that reassigns tokens to more appropriate ast nodes."""
4300 #@+others
4301 #@+node:ekr.20191231084640.1: *4* reassign.reassign
4302 def reassign(self, filename: str, tokens: List["Token"], tree: Node) -> None:
4303 """The main entry point."""
4304 self.filename = filename
4305 self.tokens = tokens
4306 # For now, just handle Call nodes.
4307 for node in ast.walk(tree):
4308 if isinstance(node, ast.Call):
4309 self.visit_call(node)
4310 #@+node:ekr.20191231084853.1: *4* reassign.visit_call
4311 def visit_call(self, node: Node) -> None:
4312 """ReassignTokens.visit_call"""
4313 tokens = tokens_for_node(self.filename, node, self.tokens)
4314 node0, node9 = tokens[0].node, tokens[-1].node
4315 nca = nearest_common_ancestor(node0, node9)
4316 if not nca:
4317 return
4318 # Associate () with the call node.
4319 i = tokens[-1].index
4320 j = find_paren_token(i + 1, self.tokens)
4321 if j is None:
4322 return # pragma: no cover
4323 k = find_paren_token(j + 1, self.tokens)
4324 if k is None:
4325 return # pragma: no cover
4326 self.tokens[j].node = nca
4327 self.tokens[k].node = nca
4328 add_token_to_token_list(self.tokens[j], nca)
4329 add_token_to_token_list(self.tokens[k], nca)
4330 #@-others
4331#@+node:ekr.20191110080535.1: *3* class Token
4332class Token:
4333 """
4334 A class representing a 5-tuple, plus additional data.
4335 """
4337 def __init__(self, kind: str, value: str):
4339 self.kind = kind
4340 self.value = value
4341 #
4342 # Injected by Tokenizer.add_token.
4343 self.five_tuple = None
4344 self.index = 0
4345 # The entire line containing the token.
4346 # Same as five_tuple.line.
4347 self.line = ''
4348 # The line number, for errors and dumps.
4349 # Same as five_tuple.start[0]
4350 self.line_number = 0
4351 #
4352 # Injected by Tokenizer.add_token.
4353 self.level = 0
4354 self.node: Optional[Node] = None
4356 def __repr__(self) -> str: # pragma: no cover
4357 nl_kind = getattr(self, 'newline_kind', '')
4358 s = f"{self.kind:}.{self.index:<3}"
4359 return f"{s:>18}:{nl_kind:7} {self.show_val(80)}"
4361 def __str__(self) -> str: # pragma: no cover
4362 nl_kind = getattr(self, 'newline_kind', '')
4363 return f"{self.kind}.{self.index:<3}{nl_kind:8} {self.show_val(80)}"
4365 def to_string(self) -> str:
4366 """Return the contribution of the token to the source file."""
4367 return self.value if isinstance(self.value, str) else ''
4368 #@+others
4369 #@+node:ekr.20191231114927.1: *4* token.brief_dump
4370 def brief_dump(self) -> str: # pragma: no cover
4371 """Dump a token."""
4372 return (
4373 f"{self.index:>3} line: {self.line_number:<2} "
4374 f"{self.kind:>11} {self.show_val(100)}")
4375 #@+node:ekr.20200223022950.11: *4* token.dump
4376 def dump(self) -> str: # pragma: no cover
4377 """Dump a token and related links."""
4378 # Let block.
4379 node_id = self.node.node_index if self.node else ''
4380 node_cn = self.node.__class__.__name__ if self.node else ''
4381 return (
4382 f"{self.line_number:4} "
4383 f"{node_id:5} {node_cn:16} "
4384 f"{self.index:>5} {self.kind:>11} "
4385 f"{self.show_val(100)}")
4386 #@+node:ekr.20200121081151.1: *4* token.dump_header
4387 def dump_header(self) -> None: # pragma: no cover
4388 """Print the header for token.dump"""
4389 print(
4390 f"\n"
4391 f" node {'':10} token token\n"
4392 f"line index class {'':10} index kind value\n"
4393 f"==== ===== ===== {'':10} ===== ==== =====\n")
4394 #@+node:ekr.20191116154328.1: *4* token.error_dump
4395 def error_dump(self) -> str: # pragma: no cover
4396 """Dump a token or result node for error message."""
4397 if self.node:
4398 node_id = obj_id(self.node)
4399 node_s = f"{node_id} {self.node.__class__.__name__}"
4400 else:
4401 node_s = "None"
4402 return (
4403 f"index: {self.index:<3} {self.kind:>12} {self.show_val(20):<20} "
4404 f"{node_s}")
4405 #@+node:ekr.20191113095507.1: *4* token.show_val
4406 def show_val(self, truncate_n: int) -> str: # pragma: no cover
4407 """Return the token.value field."""
4408 if self.kind in ('ws', 'indent'):
4409 val = str(len(self.value))
4410 elif self.kind == 'string':
4411 # Important: don't add a repr for 'string' tokens.
4412 # repr just adds another layer of confusion.
4413 val = g.truncate(self.value, truncate_n)
4414 else:
4415 val = g.truncate(repr(self.value), truncate_n)
4416 return val
4417 #@-others
4418#@+node:ekr.20191110165235.1: *3* class Tokenizer
4419class Tokenizer:
4421 """Create a list of Tokens from contents."""
4423 results: List[Token] = []
4425 #@+others
4426 #@+node:ekr.20191110165235.2: *4* tokenizer.add_token
4427 token_index = 0
4428 prev_line_token = None
4430 def add_token(self, kind: str, five_tuple: Any, line: str, s_row: int, value: str) -> None:
4431 """
4432 Add a token to the results list.
4434 Subclasses could override this method to filter out specific tokens.
4435 """
4436 tok = Token(kind, value)
4437 tok.five_tuple = five_tuple
4438 tok.index = self.token_index
4439 # Bump the token index.
4440 self.token_index += 1
4441 tok.line = line
4442 tok.line_number = s_row
4443 self.results.append(tok)
4444 #@+node:ekr.20191110170551.1: *4* tokenizer.check_results
4445 def check_results(self, contents: str) -> None:
4447 # Split the results into lines.
4448 result = ''.join([z.to_string() for z in self.results])
4449 result_lines = g.splitLines(result)
4450 # Check.
4451 ok = result == contents and result_lines == self.lines
4452 assert ok, (
4453 f"\n"
4454 f" result: {result!r}\n"
4455 f" contents: {contents!r}\n"
4456 f"result_lines: {result_lines}\n"
4457 f" lines: {self.lines}"
4458 )
4459 #@+node:ekr.20191110165235.3: *4* tokenizer.create_input_tokens
4460 def create_input_tokens(self, contents: str, tokens: Generator) -> List["Token"]:
4461 """
4462 Generate a list of Token's from tokens, a list of 5-tuples.
4463 """
4464 # Create the physical lines.
4465 self.lines = contents.splitlines(True)
4466 # Create the list of character offsets of the start of each physical line.
4467 last_offset, self.offsets = 0, [0]
4468 for line in self.lines:
4469 last_offset += len(line)
4470 self.offsets.append(last_offset)
4471 # Handle each token, appending tokens and between-token whitespace to results.
4472 self.prev_offset, self.results = -1, []
4473 for token in tokens:
4474 self.do_token(contents, token)
4475 # Check the results.
4476 self.check_results(contents)
4477 # Return results, as a list.
4478 return self.results
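# A hedged sketch of driving the Tokenizer directly with the stdlib tokenize
# module (the rest of this file normally gets its tokens via make_tokens):
#
#     import io, tokenize
#     contents = "x = 1\n"
#     five_tuples = tokenize.generate_tokens(io.StringIO(contents).readline)
#     tokens = Tokenizer().create_input_tokens(contents, five_tuples)
#     for tok in tokens:
#         print(tok.kind, repr(tok.value))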
4479 #@+node:ekr.20191110165235.4: *4* tokenizer.do_token (the gem)
4480 header_has_been_shown = False
4482 def do_token(self, contents: str, five_tuple: Any) -> None:
4483 """
4484 Handle the given token, optionally including between-token whitespace.
4486 This is part of the "gem".
4488 Links:
4490 - 11/13/19: ENB: A much better untokenizer
4491 https://groups.google.com/forum/#!msg/leo-editor/DpZ2cMS03WE/VPqtB9lTEAAJ
4493 - Untokenize does not round-trip ws before bs-nl
4494 https://bugs.python.org/issue38663
4495 """
4496 import token as token_module
4497 # Unpack..
4498 tok_type, val, start, end, line = five_tuple
4499 s_row, s_col = start # row/col offsets of start of token.
4500 e_row, e_col = end # row/col offsets of end of token.
4501 kind = token_module.tok_name[tok_type].lower()
4502 # Calculate the token's start/end offsets: character offsets into contents.
4503 s_offset = self.offsets[max(0, s_row - 1)] + s_col
4504 e_offset = self.offsets[max(0, e_row - 1)] + e_col
4505 # tok_s is the corresponding string in the line.
4506 tok_s = contents[s_offset:e_offset]
4507 # Add any preceding between-token whitespace.
4508 ws = contents[self.prev_offset:s_offset]
4509 if ws:
4510 # No need for a hook.
4511 self.add_token('ws', five_tuple, line, s_row, ws)
4512 # Always add token, even if it contributes no text!
4513 self.add_token(kind, five_tuple, line, s_row, tok_s)
4514 # Update the ending offset.
4515 self.prev_offset = e_offset
4516 #@-others
4517#@+node:ekr.20191113063144.1: *3* class TokenOrderGenerator
4518class TokenOrderGenerator:
4519 """
4520 A class that traverses ast (parse) trees in token order.
4522 Overview: https://github.com/leo-editor/leo-editor/issues/1440#issue-522090981
4524 Theory of operation:
4525 - https://github.com/leo-editor/leo-editor/issues/1440#issuecomment-573661883
4526 - http://leoeditor.com/appendices.html#tokenorder-classes-theory-of-operation
4528 How to: http://leoeditor.com/appendices.html#tokenorder-class-how-to
4530 Project history: https://github.com/leo-editor/leo-editor/issues/1440#issuecomment-574145510
4531 """
4533 begin_end_stack: List[str] = []
4534 n_nodes = 0 # The number of nodes that have been visited.
4535 node_index = 0 # The index into the node_stack.
4536 node_stack: List[ast.AST] = [] # The stack of parent nodes.
4538 #@+others
4539 #@+node:ekr.20200103174914.1: *4* tog: Init...
4540 #@+node:ekr.20191228184647.1: *5* tog.balance_tokens
4541 def balance_tokens(self, tokens: List["Token"]) -> int:
4542 """
4543 TOG.balance_tokens.
4545 Insert two-way links between matching paren tokens.
4546 """
4547 count, stack = 0, []
4548 for token in tokens:
4549 if token.kind == 'op':
4550 if token.value == '(':
4551 count += 1
4552 stack.append(token.index)
4553 if token.value == ')':
4554 if stack:
4555 index = stack.pop()
4556 tokens[index].matching_paren = token.index
4557 tokens[token.index].matching_paren = index
4558 else: # pragma: no cover
4559 g.trace(f"unmatched ')' at index {token.index}")
4560 if stack: # pragma: no cover
4561 g.trace(f"unmatched '(' at {','.join(stack)}")
4562 return count
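# Illustrative expectation (assumed, using make_tokens from this file): after
# balance_tokens, matching '(' and ')' tokens point at each other via
# token.matching_paren, and the return value is the number of '(' tokens.
#
#     tokens = make_tokens("f(g(x))\n")
#     count = TokenOrderGenerator().balance_tokens(tokens)  # count == 2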
4563 #@+node:ekr.20191113063144.4: *5* tog.create_links
4564 def create_links(self, tokens: List["Token"], tree: Node, file_name: str='') -> List:
4565 """
4566 Create two-way links between the given tokens and the ast tree.
4568 Callers should call this generator with list(tog.create_links(...))
4570 The sync_tokens method creates the links and verifies that the resulting
4571 tree traversal generates exactly the given tokens in exact order.
4573 tokens: the list of Token instances for the input.
4574 Created by make_tokens().
4575 tree: the ast tree for the input.
4576 Created by parse_ast().
4577 """
4578 # Init all ivars.
4579 self.file_name = file_name # For tests.
4580 self.level = 0 # Python indentation level.
4581 self.node = None # The node being visited.
4582 self.tokens = tokens # The immutable list of input tokens.
4583 self.tree = tree # The tree of ast.AST nodes.
4584 # Traverse the tree.
4585 self.visit(tree)
4586 # Ensure that all tokens are patched.
4587 self.node = tree
4588 self.token('endmarker', '')
4589 # Return [] for compatibility with legacy code: list(tog.create_links).
4590 return []
4591 #@+node:ekr.20191229071733.1: *5* tog.init_from_file
4592 def init_from_file(self, filename: str) -> Tuple[str, str, List["Token"], Node]: # pragma: no cover
4593 """
4594 Create the tokens and ast tree for the given file.
4595 Create links between tokens and the parse tree.
4596 Return (contents, encoding, tokens, tree).
4597 """
4598 self.level = 0
4599 self.filename = filename
4600 encoding, contents = read_file_with_encoding(filename)
4601 if not contents:
4602 return None, None, None, None
4603 self.tokens = tokens = make_tokens(contents)
4604 self.tree = tree = parse_ast(contents)
4605 self.create_links(tokens, tree)
4606 return contents, encoding, tokens, tree
4607 #@+node:ekr.20191229071746.1: *5* tog.init_from_string
4608 def init_from_string(self, contents: str, filename: str) -> Tuple[List["Token"], Node]: # pragma: no cover
4609 """
4610 Tokenize, parse and create links in the contents string.
4612 Return (tokens, tree).
4613 """
4614 self.filename = filename
4615 self.level = 0
4616 self.tokens = tokens = make_tokens(contents)
4617 self.tree = tree = parse_ast(contents)
4618 self.create_links(tokens, tree)
4619 return tokens, tree
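# A hedged end-to-end sketch: create links for a small string, then show
# which ast node "owns" each significant token.
#
#     tog = TokenOrderGenerator()
#     tokens, tree = tog.init_from_string("a = b + 1\n", filename='<string>')
#     for tok in tokens:
#         if tok.node is not None:
#             print(f"{tok.kind:>8} {tok.value!r:>8} -> {tok.node.__class__.__name__}")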
4620 #@+node:ekr.20220402052020.1: *4* tog: Synchronizers...
4621 # The synchronizers sync tokens to nodes.
4622 #@+node:ekr.20200110162044.1: *5* tog.find_next_significant_token
4623 def find_next_significant_token(self) -> Optional["Token"]:
4624 """
4625 Scan from *after* self.tokens[px] looking for the next significant
4626 token.
4628 Return the token, or None. Never change self.px.
4629 """
4630 px = self.px + 1
4631 while px < len(self.tokens):
4632 token = self.tokens[px]
4633 px += 1
4634 if is_significant_token(token):
4635 return token
4636 # This will never happen, because the endmarker token is significant.
4637 return None # pragma: no cover
4638 #@+node:ekr.20191125120814.1: *5* tog.set_links
4639 last_statement_node = None
4641 def set_links(self, node: Node, token: "Token") -> None:
4642 """Make two-way links between token and the given node."""
4643 # Don't bother assigning comment, comma, parens, ws and endmarker tokens.
4644 if token.kind == 'comment':
4645 # Append the comment to node.comment_list.
4646 comment_list: List["Token"] = getattr(node, 'comment_list', [])
4647 node.comment_list = comment_list + [token]
4648 return
4649 if token.kind in ('endmarker', 'ws'):
4650 return
4651 if token.kind == 'op' and token.value in ',()':
4652 return
4653 # *Always* remember the last statement.
4654 statement = find_statement_node(node)
4655 if statement:
4656 self.last_statement_node = statement
4657 assert not isinstance(self.last_statement_node, ast.Module)
4658 if token.node is not None: # pragma: no cover
4659 line_s = f"line {token.line_number}:"
4660 raise AssignLinksError(
4661 f" file: {self.filename}\n"
4662 f"{line_s:>12} {token.line.strip()}\n"
4663 f"token index: {self.px}\n"
4664 f"token.node is not None\n"
4665 f" token.node: {token.node.__class__.__name__}\n"
4666 f" callers: {g.callers()}")
4667 # Assign newlines to the previous statement node, if any.
4668 if token.kind in ('newline', 'nl'):
4669 # Set an *auxiliary* link for the split/join logic.
4670 # Do *not* set token.node!
4671 token.statement_node = self.last_statement_node
4672 return
4673 if is_significant_token(token):
4674 # Link the token to the ast node.
4675 token.node = node
4676 # Add the token to node's token_list.
4677 add_token_to_token_list(token, node)
4678 #@+node:ekr.20191124083124.1: *5* tog.sync_name (aka name)
4679 def sync_name(self, val: str) -> None:
4680 aList = val.split('.')
4681 if len(aList) == 1:
4682 self.sync_token('name', val)
4683 else:
4684 for i, part in enumerate(aList):
4685 self.sync_token('name', part)
4686 if i < len(aList) - 1:
4687 self.sync_op('.')
4689 name = sync_name # for readability.
4690 #@+node:ekr.20220402052102.1: *5* tog.sync_op (aka op)
4691 def sync_op(self, val: str) -> None:
4692 """
4693 Sync to the given operator.
4695 val may be '(' or ')' *only* if the parens *will* actually exist in the
4696 token list.
4697 """
4698 self.sync_token('op', val)
4700 op = sync_op # For readability.
4701 #@+node:ekr.20191113063144.7: *5* tog.sync_token (aka token)
4702 px = -1 # Index of the previously synced token.
4704 def sync_token(self, kind: str, val: str) -> None:
4705 """
4706 Sync to a token whose kind & value are given. The token need not be
4707 significant, but it must be guaranteed to exist in the token list.
4709 The checks in this method constitute a strong, ever-present, unit test.
4711 Scan the tokens *after* px, looking for a token T matching (kind, val).
4712 raise AssignLinksError if a significant token is found that doesn't match T.
4713 Otherwise:
4714 - Create two-way links between all assignable tokens between px and T.
4715 - Create two-way links between T and self.node.
4716 - Advance by updating self.px to point to T.
4717 """
4718 node, tokens = self.node, self.tokens
4719 assert isinstance(node, ast.AST), repr(node)
4720 # g.trace(
4721 # f"px: {self.px:2} "
4722 # f"node: {node.__class__.__name__:<10} "
4723 # f"kind: {kind:>10}: val: {val!r}")
4724 #
4725 # Step one: Look for token T.
4726 old_px = px = self.px + 1
4727 while px < len(self.tokens):
4728 token = tokens[px]
4729 if (kind, val) == (token.kind, token.value):
4730 break # Success.
4731 if kind == token.kind == 'number':
4732 val = token.value
4733 break # Benign: use the token's value, a string, instead of a number.
4734 if is_significant_token(token): # pragma: no cover
4735 line_s = f"line {token.line_number}:"
4736 val = str(val) # for g.truncate.
4737 raise AssignLinksError(
4738 f" file: {self.filename}\n"
4739 f"{line_s:>12} {token.line.strip()}\n"
4740 f"Looking for: {kind}.{g.truncate(val, 40)!r}\n"
4741 f" found: {token.kind}.{token.value!r}\n"
4742 f"token.index: {token.index}\n")
4743 # Skip the insignificant token.
4744 px += 1
4745 else: # pragma: no cover
4746 val = str(val) # for g.truncate.
4747 raise AssignLinksError(
4748 f" file: {self.filename}\n"
4749 f"Looking for: {kind}.{g.truncate(val, 40)}\n"
4750 f" found: end of token list")
4751 #
4752 # Step two: Assign *secondary* links only for newline tokens.
4753 # Ignore all other non-significant tokens.
4754 while old_px < px:
4755 token = tokens[old_px]
4756 old_px += 1
4757 if token.kind in ('comment', 'newline', 'nl'):
4758 self.set_links(node, token)
4759 #
4760 # Step three: Set links in the found token.
4761 token = tokens[px]
4762 self.set_links(node, token)
4763 #
4764 # Step four: Advance.
4765 self.px = px
4767 token = sync_token # For readability.
4768 #@+node:ekr.20191223052749.1: *4* tog: Traversal...
4769 #@+node:ekr.20191113063144.3: *5* tog.enter_node
4770 def enter_node(self, node: Node) -> None:
4771 """Enter a node."""
4772 # Update the stats.
4773 self.n_nodes += 1
4774 # Do this first, *before* updating self.node.
4775 node.parent = self.node
4776 if self.node:
4777 children: List[Node] = getattr(self.node, 'children', [])
4778 children.append(node)
4779 self.node.children = children
4780 # Inject the node_index field.
4781 assert not hasattr(node, 'node_index'), g.callers()
4782 node.node_index = self.node_index
4783 self.node_index += 1
4784 # begin_visitor and end_visitor must be paired.
4785 self.begin_end_stack.append(node.__class__.__name__)
4786 # Push the previous node.
4787 self.node_stack.append(self.node)
4788 # Update self.node *last*.
4789 self.node = node
4790 #@+node:ekr.20200104032811.1: *5* tog.leave_node
4791 def leave_node(self, node: Node) -> None:
4792 """Leave a visitor."""
4793 # begin_visitor and end_visitor must be paired.
4794 entry_name = self.begin_end_stack.pop()
4795 assert entry_name == node.__class__.__name__, f"{entry_name!r} {node.__class__.__name__}"
4796 assert self.node == node, (repr(self.node), repr(node))
4797 # Restore self.node.
4798 self.node = self.node_stack.pop()
4799 #@+node:ekr.20191113081443.1: *5* tog.visit
4800 def visit(self, node: Node) -> None:
4801 """Given an ast node, return a *generator* from its visitor."""
4802 # This saves a lot of tests.
4803 if node is None:
4804 return
4805 if 0: # pragma: no cover
4806 # Keep this trace!
4807 cn = node.__class__.__name__ if node else ' '
4808 caller1, caller2 = g.callers(2).split(',')
4809 g.trace(f"{caller1:>15} {caller2:<14} {cn}")
4810 # More general, more convenient.
4811 if isinstance(node, (list, tuple)):
4812 for z in node or []:
4813 if isinstance(z, ast.AST):
4814 self.visit(z)
4815 else: # pragma: no cover
4816 # Some fields may contain ints or strings.
4817 assert isinstance(z, (int, str)), z.__class__.__name__
4818 return
4819 # We *do* want to crash if the visitor doesn't exist.
4820 method = getattr(self, 'do_' + node.__class__.__name__)
4821 # Don't even *think* about removing the parent/child links.
4822 # The nearest_common_ancestor function depends upon them.
4823 self.enter_node(node)
4824 method(node)
4825 self.leave_node(node)
4826 #@+node:ekr.20191113063144.13: *4* tog: Visitors...
4827 #@+node:ekr.20191113063144.32: *5* tog.keyword: not called!
4828 # keyword arguments supplied to call (NULL identifier for **kwargs)
4830 # keyword = (identifier? arg, expr value)
4832 def do_keyword(self, node: Node) -> None: # pragma: no cover
4833 """A keyword arg in an ast.Call."""
4834 # This should never be called.
4835 # tog.handle_call_arguments calls self.visit(kwarg_arg.value) instead.
4836 filename = getattr(self, 'filename', '<no file>')
4837 raise AssignLinksError(
4838 f"file: {filename}\n"
4839 f"do_keyword should never be called\n"
4840 f"{g.callers(8)}")
4841 #@+node:ekr.20191113063144.14: *5* tog: Contexts
4842 #@+node:ekr.20191113063144.28: *6* tog.arg
4843 # arg = (identifier arg, expr? annotation)
4845 def do_arg(self, node: Node) -> None:
4846 """This is one argument of a list of ast.Function or ast.Lambda arguments."""
4847 self.name(node.arg)
4848 annotation = getattr(node, 'annotation', None)
4849 if annotation is not None:
4850 self.op(':')
4851 self.visit(node.annotation)
4852 #@+node:ekr.20191113063144.27: *6* tog.arguments
4853 # arguments = (
4854 # arg* posonlyargs, arg* args, arg? vararg, arg* kwonlyargs,
4855 # expr* kw_defaults, arg? kwarg, expr* defaults
4856 # )
4858 def do_arguments(self, node: Node) -> None:
4859 """Arguments to ast.Function or ast.Lambda, **not** ast.Call."""
4860 #
4861 # No need to generate commas anywhere below.
4862 #
4863 # Let block. Some fields may not exist pre Python 3.8.
4864 n_plain = len(node.args) - len(node.defaults)
4865 posonlyargs = getattr(node, 'posonlyargs', [])
4866 vararg = getattr(node, 'vararg', None)
4867 kwonlyargs = getattr(node, 'kwonlyargs', [])
4868 kw_defaults = getattr(node, 'kw_defaults', [])
4869 kwarg = getattr(node, 'kwarg', None)
4870 # 1. Sync the position-only args.
4871 if posonlyargs:
4872 for n, z in enumerate(posonlyargs):
4873 # g.trace('pos-only', ast.dump(z))
4874 self.visit(z)
4875 self.op('/')
4876 # 2. Sync all args.
4877 for i, z in enumerate(node.args):
4878 self.visit(z)
4879 if i >= n_plain:
4880 self.op('=')
4881 self.visit(node.defaults[i - n_plain])
4882 # 3. Sync the vararg.
4883 if vararg:
4884 self.op('*')
4885 self.visit(vararg)
4886 # 4. Sync the keyword-only args.
4887 if kwonlyargs:
4888 if not vararg:
4889 self.op('*')
4890 for n, z in enumerate(kwonlyargs):
4891 self.visit(z)
4892 val = kw_defaults[n]
4893 if val is not None:
4894 self.op('=')
4895 self.visit(val)
4896 # 5. Sync the kwarg.
4897 if kwarg:
4898 self.op('**')
4899 self.visit(kwarg)
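# Illustration of the argument forms handled above, keyed to the numbered
# steps in the comments (a generic signature, not taken from any test):
#
#     def f(p, /, a, b=1, *args, k=2, **kw): ...
#
#     p       -> step 1: positional-only args, followed by '/'.
#     a, b=1  -> step 2: plain args, '=' synced for trailing defaults.
#     *args   -> step 3: the vararg.
#     k=2     -> step 4: keyword-only args (a bare '*' is synced if there is no vararg).
#     **kw    -> step 5: the kwarg.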
4900 #@+node:ekr.20191113063144.15: *6* tog.AsyncFunctionDef
4901 # AsyncFunctionDef(identifier name, arguments args, stmt* body, expr* decorator_list,
4902 # expr? returns)
4904 def do_AsyncFunctionDef(self, node: Node) -> None:
4906 if node.decorator_list:
4907 for z in node.decorator_list:
4908 # '@%s\n'
4909 self.op('@')
4910 self.visit(z)
4911 # 'async def (%s): -> %s\n'
4912 # 'async def %s(%s):\n'
4913 async_token_type = 'async' if has_async_tokens else 'name'
4914 self.token(async_token_type, 'async')
4915 self.name('def')
4916 self.name(node.name) # A string
4917 self.op('(')
4918 self.visit(node.args)
4919 self.op(')')
4920 returns = getattr(node, 'returns', None)
4921 if returns is not None:
4922 self.op('->')
4923 self.visit(node.returns)
4924 self.op(':')
4925 self.level += 1
4926 self.visit(node.body)
4927 self.level -= 1
4928 #@+node:ekr.20191113063144.16: *6* tog.ClassDef
4929 def do_ClassDef(self, node: Node) -> None:
4931 for z in node.decorator_list or []:
4932 # @{z}\n
4933 self.op('@')
4934 self.visit(z)
4935 # class name(bases):\n
4936 self.name('class')
4937 self.name(node.name) # A string.
4938 if node.bases:
4939 self.op('(')
4940 self.visit(node.bases)
4941 self.op(')')
4942 self.op(':')
4943 # Body...
4944 self.level += 1
4945 self.visit(node.body)
4946 self.level -= 1
4947 #@+node:ekr.20191113063144.17: *6* tog.FunctionDef
4948 # FunctionDef(
4949 # identifier name, arguments args,
4950 # stmt* body,
4951 # expr* decorator_list,
4952 # expr? returns,
4953 # string? type_comment)
4955 def do_FunctionDef(self, node: Node) -> None:
4957 # Guards...
4958 returns = getattr(node, 'returns', None)
4959 # Decorators...
4960 # @{z}\n
4961 for z in node.decorator_list or []:
4962 self.op('@')
4963 self.visit(z)
4964 # Signature...
4965 # def name(args): -> returns\n
4966 # def name(args):\n
4967 self.name('def')
4968 self.name(node.name) # A string.
4969 self.op('(')
4970 self.visit(node.args)
4971 self.op(')')
4972 if returns is not None:
4973 self.op('->')
4974 self.visit(node.returns)
4975 self.op(':')
4976 # Body...
4977 self.level += 1
4978 self.visit(node.body)
4979 self.level -= 1
4980 #@+node:ekr.20191113063144.18: *6* tog.Interactive
4981 def do_Interactive(self, node: Node) -> None: # pragma: no cover
4983 self.visit(node.body)
4984 #@+node:ekr.20191113063144.20: *6* tog.Lambda
4985 def do_Lambda(self, node: Node) -> None:
4987 self.name('lambda')
4988 self.visit(node.args)
4989 self.op(':')
4990 self.visit(node.body)
4991 #@+node:ekr.20191113063144.19: *6* tog.Module
4992 def do_Module(self, node: Node) -> None:
4994 # Encoding is a non-syncing statement.
4995 self.visit(node.body)
4996 #@+node:ekr.20191113063144.21: *5* tog: Expressions
4997 #@+node:ekr.20191113063144.22: *6* tog.Expr
4998 def do_Expr(self, node: Node) -> None:
4999 """An outer expression."""
5000 # No need to put parentheses.
5001 self.visit(node.value)
5002 #@+node:ekr.20191113063144.23: *6* tog.Expression
5003 def do_Expression(self, node: Node) -> None: # pragma: no cover
5004 """An inner expression."""
5005 # No need to put parentheses.
5006 self.visit(node.body)
5007 #@+node:ekr.20191113063144.24: *6* tog.GeneratorExp
5008 def do_GeneratorExp(self, node: Node) -> None:
5010 # '<gen %s for %s>' % (elt, ','.join(gens))
5011 # No need to put parentheses or commas.
5012 self.visit(node.elt)
5013 self.visit(node.generators)
5014 #@+node:ekr.20210321171703.1: *6* tog.NamedExpr
5015 # NamedExpr(expr target, expr value)
5017 def do_NamedExpr(self, node: Node) -> None: # Python 3.8+
5019 self.visit(node.target)
5020 self.op(':=')
5021 self.visit(node.value)
5022 #@+node:ekr.20191113063144.26: *5* tog: Operands
5023 #@+node:ekr.20191113063144.29: *6* tog.Attribute
5024 # Attribute(expr value, identifier attr, expr_context ctx)
5026 def do_Attribute(self, node: Node) -> None:
5028 self.visit(node.value)
5029 self.op('.')
5030 self.name(node.attr) # A string.
5031 #@+node:ekr.20191113063144.30: *6* tog.Bytes
5032 def do_Bytes(self, node: Node) -> None:
5034 """
5035 It's invalid to mix bytes and non-bytes literals, so just
5036 advancing to the next 'string' token suffices.
5037 """
5038 token = self.find_next_significant_token()
5039 self.token('string', token.value)
5040 #@+node:ekr.20191113063144.33: *6* tog.comprehension
5041 # comprehension = (expr target, expr iter, expr* ifs, int is_async)
5043 def do_comprehension(self, node: Node) -> None:
5045 # No need to put parentheses.
5046 self.name('for') # #1858.
5047 self.visit(node.target) # A name
5048 self.name('in')
5049 self.visit(node.iter)
5050 for z in node.ifs or []:
5051 self.name('if')
5052 self.visit(z)
5053 #@+node:ekr.20191113063144.34: *6* tog.Constant
5054 def do_Constant(self, node: Node) -> None: # pragma: no cover
5055 """
5056 https://greentreesnakes.readthedocs.io/en/latest/nodes.html
5058 A constant. The value attribute holds the Python object it represents.
5059 This can be simple types such as a number, string or None, but also
5060 immutable container types (tuples and frozensets) if all of their
5061 elements are constant.
5062 """
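        # For example (illustrative summary of the dispatch below): None, True and
        # False sync a 'name' token, str values are delegated to do_Str, bytes
        # values to do_Bytes, and int/float values sync a 'number' token.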
5063 # Support Python 3.8.
5064 if node.value is None or isinstance(node.value, bool):
5065 # Weird: return a name!
5066 self.token('name', repr(node.value))
5067 elif node.value == Ellipsis:
5068 self.op('...')
5069 elif isinstance(node.value, str):
5070 self.do_Str(node)
5071 elif isinstance(node.value, (int, float)):
5072 self.token('number', repr(node.value))
5073 elif isinstance(node.value, bytes):
5074 self.do_Bytes(node)
5075 elif isinstance(node.value, tuple):
5076 self.do_Tuple(node)
5077 elif isinstance(node.value, frozenset):
5078 self.do_Set(node)
5079 else:
5080 # Unknown type.
5081 g.trace('----- Oops -----', repr(node.value), g.callers())
5082 #@+node:ekr.20191113063144.35: *6* tog.Dict
5083 # Dict(expr* keys, expr* values)
5085 def do_Dict(self, node: Node) -> None:
5087 assert len(node.keys) == len(node.values)
5088 self.op('{')
5089 # No need to put commas.
5090 for i, key in enumerate(node.keys):
5091 key, value = node.keys[i], node.values[i]
5092 self.visit(key) # a Str node.
5093 self.op(':')
5094 if value is not None:
5095 self.visit(value)
5096 self.op('}')
5097 #@+node:ekr.20191113063144.36: *6* tog.DictComp
5098 # DictComp(expr key, expr value, comprehension* generators)
5100 # d2 = {val: key for key, val in d}
5102 def do_DictComp(self, node: Node) -> None:
5104 self.token('op', '{')
5105 self.visit(node.key)
5106 self.op(':')
5107 self.visit(node.value)
5108 for z in node.generators or []:
5109 self.visit(z)
5110 self.token('op', '}')
5111 #@+node:ekr.20191113063144.37: *6* tog.Ellipsis
5112 def do_Ellipsis(self, node: Node) -> None: # pragma: no cover (Does not exist for python 3.8+)
5114 self.op('...')
5115 #@+node:ekr.20191113063144.38: *6* tog.ExtSlice
5116 # https://docs.python.org/3/reference/expressions.html#slicings
5118 # ExtSlice(slice* dims)
5120 def do_ExtSlice(self, node: Node) -> None: # pragma: no cover (deprecated)
5122 # ','.join(node.dims)
5123 for i, z in enumerate(node.dims):
5124 self.visit(z)
5125 if i < len(node.dims) - 1:
5126 self.op(',')
5127 #@+node:ekr.20191113063144.40: *6* tog.Index
5128 def do_Index(self, node: Node) -> None: # pragma: no cover (deprecated)
5130 self.visit(node.value)
5131 #@+node:ekr.20191113063144.39: *6* tog.FormattedValue: not called!
5132 # FormattedValue(expr value, int? conversion, expr? format_spec)
5134 def do_FormattedValue(self, node: Node) -> None: # pragma: no cover
5135 """
5136 This node represents the *components* of a *single* f-string.
5138 Happily, JoinedStr nodes *also* represent *all* f-strings,
5139        so the TOG should *never* visit this node!
5140 """
5141 filename = getattr(self, 'filename', '<no file>')
5142 raise AssignLinksError(
5143 f"file: {filename}\n"
5144 f"do_FormattedValue should never be called")
5146 # This code has no chance of being useful...
5148 # conv = node.conversion
5149 # spec = node.format_spec
5150 # self.visit(node.value)
5151 # if conv is not None:
5152 # self.token('number', conv)
5153 # if spec is not None:
5154 # self.visit(node.format_spec)
5155 #@+node:ekr.20191113063144.41: *6* tog.JoinedStr & helpers
5156 # JoinedStr(expr* values)
5158 def do_JoinedStr(self, node: Node) -> None:
5159 """
5160 JoinedStr nodes represent at least one f-string and all other strings
5161        concatenated to it.
5163 Analyzing JoinedStr.values would be extremely tricky, for reasons that
5164 need not be explained here.
5166 Instead, we get the tokens *from the token list itself*!
5167 """
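        # For example (illustrative): the source f"{a}" "b" parses to a *single*
        # JoinedStr node, but the tokenizer emits two separate 'string' tokens;
        # we sync those tokens directly instead of visiting node.values.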
5168 for z in self.get_concatenated_string_tokens():
5169 self.token(z.kind, z.value)
5170 #@+node:ekr.20191113063144.42: *6* tog.List
5171 def do_List(self, node: Node) -> None:
5173 # No need to put commas.
5174 self.op('[')
5175 self.visit(node.elts)
5176 self.op(']')
5177 #@+node:ekr.20191113063144.43: *6* tog.ListComp
5178 # ListComp(expr elt, comprehension* generators)
5180 def do_ListComp(self, node: Node) -> None:
5182 self.op('[')
5183 self.visit(node.elt)
5184 for z in node.generators:
5185 self.visit(z)
5186 self.op(']')
5187 #@+node:ekr.20191113063144.44: *6* tog.Name & NameConstant
5188 def do_Name(self, node: Node) -> None:
5190 self.name(node.id)
5192 def do_NameConstant(self, node: Node) -> None: # pragma: no cover (Does not exist in Python 3.8+)
5194 self.name(repr(node.value))
5196 #@+node:ekr.20191113063144.45: *6* tog.Num
5197 def do_Num(self, node: Node) -> None: # pragma: no cover (Does not exist in Python 3.8+)
5199 self.token('number', node.n)
5200 #@+node:ekr.20191113063144.47: *6* tog.Set
5201 # Set(expr* elts)
5203 def do_Set(self, node: Node) -> None:
5205 self.op('{')
5206 self.visit(node.elts)
5207 self.op('}')
5208 #@+node:ekr.20191113063144.48: *6* tog.SetComp
5209 # SetComp(expr elt, comprehension* generators)
5211 def do_SetComp(self, node: Node) -> None:
5213 self.op('{')
5214 self.visit(node.elt)
5215 for z in node.generators or []:
5216 self.visit(z)
5217 self.op('}')
5218 #@+node:ekr.20191113063144.49: *6* tog.Slice
5219 # slice = Slice(expr? lower, expr? upper, expr? step)
5221 def do_Slice(self, node: Node) -> None:
5223 lower = getattr(node, 'lower', None)
5224 upper = getattr(node, 'upper', None)
5225 step = getattr(node, 'step', None)
5226 if lower is not None:
5227 self.visit(lower)
5228        # Always put the colon between lower and upper.
5229 self.op(':')
5230 if upper is not None:
5231 self.visit(upper)
5232 # Put the second colon if it exists in the token list.
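        # For example (illustrative): a[1:2] and a[1:2:] both have step=None,
        # so only the token list can tell whether a second ':' must be synced.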
5233 if step is None:
5234 token = self.find_next_significant_token()
5235 if token and token.value == ':':
5236 self.op(':')
5237 else:
5238 self.op(':')
5239 self.visit(step)
5240 #@+node:ekr.20191113063144.50: *6* tog.Str & helper
5241 def do_Str(self, node: Node) -> None:
5242 """This node represents a string constant."""
5243 # This loop is necessary to handle string concatenation.
5244 for z in self.get_concatenated_string_tokens():
5245 self.token(z.kind, z.value)
5246 #@+node:ekr.20200111083914.1: *7* tog.get_concatenated_tokens
5247 def get_concatenated_string_tokens(self) -> List["Token"]:
5248 """
5249 Return the next 'string' token and all 'string' tokens concatenated to
5250 it. *Never* update self.px here.
5251 """
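        # For example (illustrative): for the source x = 'a' 'b', this returns
        # the two 'string' tokens for 'a' and 'b'. Whitespace, newline and
        # comment tokens are skipped; any 'op' or significant token ends the scan.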
5252 trace = False
5253 tag = 'tog.get_concatenated_string_tokens'
5254 i = self.px
5255 # First, find the next significant token. It should be a string.
5256 i, token = i + 1, None
5257 while i < len(self.tokens):
5258 token = self.tokens[i]
5259 i += 1
5260 if token.kind == 'string':
5261 # Rescan the string.
5262 i -= 1
5263 break
5264 # An error.
5265 if is_significant_token(token): # pragma: no cover
5266 break
5267 # Raise an error if we didn't find the expected 'string' token.
5268 if not token or token.kind != 'string': # pragma: no cover
5269 if not token:
5270 token = self.tokens[-1]
5271 filename = getattr(self, 'filename', '<no filename>')
5272 raise AssignLinksError(
5273 f"\n"
5274 f"{tag}...\n"
5275 f"file: {filename}\n"
5276 f"line: {token.line_number}\n"
5277 f" i: {i}\n"
5278 f"expected 'string' token, got {token!s}")
5279 # Accumulate string tokens.
5280 assert self.tokens[i].kind == 'string'
5281 results = []
5282 while i < len(self.tokens):
5283 token = self.tokens[i]
5284 i += 1
5285 if token.kind == 'string':
5286 results.append(token)
5287 elif token.kind == 'op' or is_significant_token(token):
5288 # Any significant token *or* any op will halt string concatenation.
5289 break
5290 # 'ws', 'nl', 'newline', 'comment', 'indent', 'dedent', etc.
5291        # The (significant) 'endmarker' token ensures we will have a result.
5292 assert results
5293 if trace: # pragma: no cover
5294 g.printObj(results, tag=f"{tag}: Results")
5295 return results
5296 #@+node:ekr.20191113063144.51: *6* tog.Subscript
5297 # Subscript(expr value, slice slice, expr_context ctx)
5299 def do_Subscript(self, node: Node) -> None:
5301 self.visit(node.value)
5302 self.op('[')
5303 self.visit(node.slice)
5304 self.op(']')
5305 #@+node:ekr.20191113063144.52: *6* tog.Tuple
5306 # Tuple(expr* elts, expr_context ctx)
5308 def do_Tuple(self, node: Node) -> None:
5310 # Do not call op for parens or commas here.
5311 # They do not necessarily exist in the token list!
5312 self.visit(node.elts)
5313 #@+node:ekr.20191113063144.53: *5* tog: Operators
5314 #@+node:ekr.20191113063144.55: *6* tog.BinOp
5315 def do_BinOp(self, node: Node) -> None:
5317 op_name_ = op_name(node.op)
5318 self.visit(node.left)
5319 self.op(op_name_)
5320 self.visit(node.right)
5321 #@+node:ekr.20191113063144.56: *6* tog.BoolOp
5322 # BoolOp(boolop op, expr* values)
5324 def do_BoolOp(self, node: Node) -> None:
5326 # op.join(node.values)
5327 op_name_ = op_name(node.op)
5328 for i, z in enumerate(node.values):
5329 self.visit(z)
5330 if i < len(node.values) - 1:
5331 self.name(op_name_)
5332 #@+node:ekr.20191113063144.57: *6* tog.Compare
5333 # Compare(expr left, cmpop* ops, expr* comparators)
5335 def do_Compare(self, node: Node) -> None:
5337 assert len(node.ops) == len(node.comparators)
5338 self.visit(node.left)
5339 for i, z in enumerate(node.ops):
5340 op_name_ = op_name(node.ops[i])
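            # 'not in' and 'is not' appear as two separate 'name' tokens in the
            # token list, so they are synced word by word below.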
5341 if op_name_ in ('not in', 'is not'):
5342 for z in op_name_.split(' '):
5343 self.name(z)
5344 elif op_name_.isalpha():
5345 self.name(op_name_)
5346 else:
5347 self.op(op_name_)
5348 self.visit(node.comparators[i])
5349 #@+node:ekr.20191113063144.58: *6* tog.UnaryOp
5350 def do_UnaryOp(self, node: Node) -> None:
5352 op_name_ = op_name(node.op)
5353 if op_name_.isalpha():
5354 self.name(op_name_)
5355 else:
5356 self.op(op_name_)
5357 self.visit(node.operand)
5358 #@+node:ekr.20191113063144.59: *6* tog.IfExp (ternary operator)
5359 # IfExp(expr test, expr body, expr orelse)
5361 def do_IfExp(self, node: Node) -> None:
5363 #'%s if %s else %s'
5364 self.visit(node.body)
5365 self.name('if')
5366 self.visit(node.test)
5367 self.name('else')
5368 self.visit(node.orelse)
5369 #@+node:ekr.20191113063144.60: *5* tog: Statements
5370 #@+node:ekr.20191113063144.83: *6* tog.Starred
5371 # Starred(expr value, expr_context ctx)
5373 def do_Starred(self, node: Node) -> None:
5374 """A starred argument to an ast.Call"""
5375 self.op('*')
5376 self.visit(node.value)
5377 #@+node:ekr.20191113063144.61: *6* tog.AnnAssign
5378 # AnnAssign(expr target, expr annotation, expr? value, int simple)
5380 def do_AnnAssign(self, node: Node) -> None:
5382 # {node.target}:{node.annotation}={node.value}\n'
5383 self.visit(node.target)
5384 self.op(':')
5385 self.visit(node.annotation)
5386 if node.value is not None: # #1851
5387 self.op('=')
5388 self.visit(node.value)
5389 #@+node:ekr.20191113063144.62: *6* tog.Assert
5390 # Assert(expr test, expr? msg)
5392 def do_Assert(self, node: Node) -> None:
5394 # Guards...
5395 msg = getattr(node, 'msg', None)
5396 # No need to put parentheses or commas.
5397 self.name('assert')
5398 self.visit(node.test)
5399 if msg is not None:
5400 self.visit(node.msg)
5401 #@+node:ekr.20191113063144.63: *6* tog.Assign
5402 def do_Assign(self, node: Node) -> None:
5404 for z in node.targets:
5405 self.visit(z)
5406 self.op('=')
5407 self.visit(node.value)
5408 #@+node:ekr.20191113063144.64: *6* tog.AsyncFor
5409 def do_AsyncFor(self, node: Node) -> None:
5411        # The 'async for' line...
5412 # Py 3.8 changes the kind of token.
5413 async_token_type = 'async' if has_async_tokens else 'name'
5414 self.token(async_token_type, 'async')
5415 self.name('for')
5416 self.visit(node.target)
5417 self.name('in')
5418 self.visit(node.iter)
5419 self.op(':')
5420 # Body...
5421 self.level += 1
5422 self.visit(node.body)
5423 # Else clause...
5424 if node.orelse:
5425 self.name('else')
5426 self.op(':')
5427 self.visit(node.orelse)
5428 self.level -= 1
5429 #@+node:ekr.20191113063144.65: *6* tog.AsyncWith
5430 def do_AsyncWith(self, node: Node) -> None:
5432 async_token_type = 'async' if has_async_tokens else 'name'
5433 self.token(async_token_type, 'async')
5434 self.do_With(node)
5435 #@+node:ekr.20191113063144.66: *6* tog.AugAssign
5436 # AugAssign(expr target, operator op, expr value)
5438 def do_AugAssign(self, node: Node) -> None:
5440        # '%s%s=%s\n'
5441 op_name_ = op_name(node.op)
5442 self.visit(node.target)
5443 self.op(op_name_ + '=')
5444 self.visit(node.value)
5445 #@+node:ekr.20191113063144.67: *6* tog.Await
5446 # Await(expr value)
5448 def do_Await(self, node: Node) -> None:
5450 #'await %s\n'
5451 async_token_type = 'await' if has_async_tokens else 'name'
5452 self.token(async_token_type, 'await')
5453 self.visit(node.value)
5454 #@+node:ekr.20191113063144.68: *6* tog.Break
5455 def do_Break(self, node: Node) -> None:
5457 self.name('break')
5458 #@+node:ekr.20191113063144.31: *6* tog.Call & helpers
5459 # Call(expr func, expr* args, keyword* keywords)
5461 # Python 3 ast.Call nodes do not have 'starargs' or 'kwargs' fields.
5463 def do_Call(self, node: Node) -> None:
5465 # The calls to op(')') and op('(') do nothing by default.
5466 # Subclasses might handle them in an overridden tog.set_links.
5467 self.visit(node.func)
5468 self.op('(')
5469 # No need to generate any commas.
5470 self.handle_call_arguments(node)
5471 self.op(')')
5472 #@+node:ekr.20191204114930.1: *7* tog.arg_helper
5473 def arg_helper(self, node: Union[Node, str]) -> None:
5474 """
5475        Visit the node, with a special case for strings: sync a 'name' token.
5476 """
5477 if isinstance(node, str):
5478 self.token('name', node)
5479 else:
5480 self.visit(node)
5481 #@+node:ekr.20191204105506.1: *7* tog.handle_call_arguments
5482 def handle_call_arguments(self, node: Node) -> None:
5483 """
5484 Generate arguments in the correct order.
5486 Call(expr func, expr* args, keyword* keywords)
5488 https://docs.python.org/3/reference/expressions.html#calls
5490 Warning: This code will fail on Python 3.8 only for calls
5491 containing kwargs in unexpected places.
5492 """
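        # For example (illustrative): for f(a, *rest, b=1, **kw) on Python 3.9+,
        # args and keywords are merged and sorted by (lineno, col_offset), so the
        # tokens are synced in source order: a, *rest, b=1, **kw.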
5493 # *args: in node.args[]: Starred(value=Name(id='args'))
5494 # *[a, 3]: in node.args[]: Starred(value=List(elts=[Name(id='a'), Num(n=3)])
5495 # **kwargs: in node.keywords[]: keyword(arg=None, value=Name(id='kwargs'))
5496 #
5497 # Scan args for *name or *List
5498 args = node.args or []
5499 keywords = node.keywords or []
5501 def get_pos(obj: Any) -> Tuple[int, int, Any]:
5502 line1 = getattr(obj, 'lineno', None)
5503 col1 = getattr(obj, 'col_offset', None)
5504 return line1, col1, obj
5506 def sort_key(aTuple: Tuple) -> int:
5507 line, col, obj = aTuple
5508 return line * 1000 + col
5510 if 0: # pragma: no cover
5511 g.printObj([ast.dump(z) for z in args], tag='args')
5512 g.printObj([ast.dump(z) for z in keywords], tag='keywords')
5514 if py_version >= (3, 9):
5515 places = [get_pos(z) for z in args + keywords]
5516 places.sort(key=sort_key)
5517 ordered_args = [z[2] for z in places]
5518 for z in ordered_args:
5519 if isinstance(z, ast.Starred):
5520 self.op('*')
5521 self.visit(z.value)
5522 elif isinstance(z, ast.keyword):
5523 if getattr(z, 'arg', None) is None:
5524 self.op('**')
5525 self.arg_helper(z.value)
5526 else:
5527 self.arg_helper(z.arg)
5528 self.op('=')
5529 self.arg_helper(z.value)
5530 else:
5531 self.arg_helper(z)
5532 else: # pragma: no cover
5533 #
5534 # Legacy code: May fail for Python 3.8
5535 #
5536 # Scan args for *arg and *[...]
5537 kwarg_arg = star_arg = None
5538 for z in args:
5539 if isinstance(z, ast.Starred):
5540 if isinstance(z.value, ast.Name): # *Name.
5541 star_arg = z
5542 args.remove(z)
5543 break
5544 elif isinstance(z.value, (ast.List, ast.Tuple)): # *[...]
5545 # star_list = z
5546 break
5547 raise AttributeError(f"Invalid * expression: {ast.dump(z)}") # pragma: no cover
5548 # Scan keywords for **name.
5549 for z in keywords:
5550 if hasattr(z, 'arg') and z.arg is None:
5551 kwarg_arg = z
5552 keywords.remove(z)
5553 break
5554 # Sync the plain arguments.
5555 for z in args:
5556 self.arg_helper(z)
5557 # Sync the keyword args.
5558 for z in keywords:
5559 self.arg_helper(z.arg)
5560 self.op('=')
5561 self.arg_helper(z.value)
5562 # Sync the * arg.
5563 if star_arg:
5564 self.arg_helper(star_arg)
5565 # Sync the ** kwarg.
5566 if kwarg_arg:
5567 self.op('**')
5568 self.visit(kwarg_arg.value)
5569 #@+node:ekr.20191113063144.69: *6* tog.Continue
5570 def do_Continue(self, node: Node) -> None:
5572 self.name('continue')
5573 #@+node:ekr.20191113063144.70: *6* tog.Delete
5574 def do_Delete(self, node: Node) -> None:
5576 # No need to put commas.
5577 self.name('del')
5578 self.visit(node.targets)
5579 #@+node:ekr.20191113063144.71: *6* tog.ExceptHandler
5580 def do_ExceptHandler(self, node: Node) -> None:
5582 # Except line...
5583 self.name('except')
5584 if getattr(node, 'type', None):
5585 self.visit(node.type)
5586 if getattr(node, 'name', None):
5587 self.name('as')
5588 self.name(node.name)
5589 self.op(':')
5590 # Body...
5591 self.level += 1
5592 self.visit(node.body)
5593 self.level -= 1
5594 #@+node:ekr.20191113063144.73: *6* tog.For
5595 def do_For(self, node: Node) -> None:
5597        # The 'for' line...
5598 self.name('for')
5599 self.visit(node.target)
5600 self.name('in')
5601 self.visit(node.iter)
5602 self.op(':')
5603 # Body...
5604 self.level += 1
5605 self.visit(node.body)
5606 # Else clause...
5607 if node.orelse:
5608 self.name('else')
5609 self.op(':')
5610 self.visit(node.orelse)
5611 self.level -= 1
5612 #@+node:ekr.20191113063144.74: *6* tog.Global
5613 # Global(identifier* names)
5615 def do_Global(self, node: Node) -> None:
5617 self.name('global')
5618 for z in node.names:
5619 self.name(z)
5620 #@+node:ekr.20191113063144.75: *6* tog.If & helpers
5621 # If(expr test, stmt* body, stmt* orelse)
5623 def do_If(self, node: Node) -> None:
5624 #@+<< do_If docstring >>
5625 #@+node:ekr.20191122222412.1: *7* << do_If docstring >>
5626 """
5627 The parse trees for the following are identical!
5629          if 1:            if 1:
5630              pass             pass
5631          else:            elif 2:
5632              if 2:            pass
5633                  pass
5635 So there is *no* way for the 'if' visitor to disambiguate the above two
5636 cases from the parse tree alone.
5638 Instead, we scan the tokens list for the next 'if', 'else' or 'elif' token.
5639 """
5640 #@-<< do_If docstring >>
5641 # Use the next significant token to distinguish between 'if' and 'elif'.
5642 token = self.find_next_significant_token()
5643 self.name(token.value)
5644 self.visit(node.test)
5645 self.op(':')
5646 #
5647 # Body...
5648 self.level += 1
5649 self.visit(node.body)
5650 self.level -= 1
5651 #
5652 # Else and elif clauses...
5653 if node.orelse:
5654 self.level += 1
5655 token = self.find_next_significant_token()
5656 if token.value == 'else':
5657 self.name('else')
5658 self.op(':')
5659 self.visit(node.orelse)
5660 else:
5661 self.visit(node.orelse)
5662 self.level -= 1
5663 #@+node:ekr.20191113063144.76: *6* tog.Import & helper
5664 def do_Import(self, node: Node) -> None:
5666 self.name('import')
5667 for alias in node.names:
5668 self.name(alias.name)
5669 if alias.asname:
5670 self.name('as')
5671 self.name(alias.asname)
5672 #@+node:ekr.20191113063144.77: *6* tog.ImportFrom
5673 # ImportFrom(identifier? module, alias* names, int? level)
5675 def do_ImportFrom(self, node: Node) -> None:
5677 self.name('from')
5678 for i in range(node.level):
5679 self.op('.')
5680 if node.module:
5681 self.name(node.module)
5682 self.name('import')
5683 # No need to put commas.
5684 for alias in node.names:
5685 if alias.name == '*': # #1851.
5686 self.op('*')
5687 else:
5688 self.name(alias.name)
5689 if alias.asname:
5690 self.name('as')
5691 self.name(alias.asname)
5692 #@+node:ekr.20220401034726.1: *6* tog.Match* (Python 3.10+)
5693 # Match(expr subject, match_case* cases)
5695 # match_case = (pattern pattern, expr? guard, stmt* body)
5697 # Full syntax diagram: # https://peps.python.org/pep-0634/#appendix-a
5699 def do_Match(self, node: Node) -> None:
5701 cases = getattr(node, 'cases', [])
5702 self.name('match')
5703 self.visit(node.subject)
5704 self.op(':')
5705 for case in cases:
5706 self.visit(case)
5707 #@+node:ekr.20220401034726.2: *7* tog.match_case
5708 # match_case = (pattern pattern, expr? guard, stmt* body)
5710 def do_match_case(self, node: Node) -> None:
5712 guard = getattr(node, 'guard', None)
5713 body = getattr(node, 'body', [])
5714 self.name('case')
5715 self.visit(node.pattern)
5716 if guard:
5717 self.name('if')
5718 self.visit(guard)
5719 self.op(':')
5720 for statement in body:
5721 self.visit(statement)
5722 #@+node:ekr.20220401034726.3: *7* tog.MatchAs
5723 # MatchAs(pattern? pattern, identifier? name)
5725 def do_MatchAs(self, node: Node) -> None:
5726 pattern = getattr(node, 'pattern', None)
5727 name = getattr(node, 'name', None)
5728 if pattern and name:
5729 self.visit(pattern)
5730 self.name('as')
5731 self.name(name)
5732 elif pattern:
5733 self.visit(pattern) # pragma: no cover
5734 else:
5735 self.name(name or '_')
5736 #@+node:ekr.20220401034726.4: *7* tog.MatchClass
5737 # MatchClass(expr cls, pattern* patterns, identifier* kwd_attrs, pattern* kwd_patterns)
5739 def do_MatchClass(self, node: Node) -> None:
5741 patterns = getattr(node, 'patterns', [])
5742 kwd_attrs = getattr(node, 'kwd_attrs', [])
5743 kwd_patterns = getattr(node, 'kwd_patterns', [])
5744 self.visit(node.cls)
5745 self.op('(')
5746 for pattern in patterns:
5747 self.visit(pattern)
5748 for i, kwd_attr in enumerate(kwd_attrs):
5749 self.name(kwd_attr) # a String.
5750 self.op('=')
5751 self.visit(kwd_patterns[i])
5752 self.op(')')
5753 #@+node:ekr.20220401034726.5: *7* tog.MatchMapping
5754 # MatchMapping(expr* keys, pattern* patterns, identifier? rest)
5756 def do_MatchMapping(self, node: Node) -> None:
5757 keys = getattr(node, 'keys', [])
5758 patterns = getattr(node, 'patterns', [])
5759 rest = getattr(node, 'rest', None)
5760 self.op('{')
5761 for i, key in enumerate(keys):
5762 self.visit(key)
5763 self.op(':')
5764 self.visit(patterns[i])
5765 if rest:
5766 self.op('**')
5767 self.name(rest) # A string.
5768 self.op('}')
5769 #@+node:ekr.20220401034726.6: *7* tog.MatchOr
5770 # MatchOr(pattern* patterns)
5772 def do_MatchOr(self, node: Node) -> None:
5773 patterns = getattr(node, 'patterns', [])
5774 for i, pattern in enumerate(patterns):
5775 if i > 0:
5776 self.op('|')
5777 self.visit(pattern)
5778 #@+node:ekr.20220401034726.7: *7* tog.MatchSequence
5779 # MatchSequence(pattern* patterns)
5781 def do_MatchSequence(self, node: Node) -> None:
5782 patterns = getattr(node, 'patterns', [])
5783 # Scan for the next '(' or '[' token, skipping the 'case' token.
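        # For example (illustrative): case [a, b]:, case (a, b): and case a, b:
        # all produce MatchSequence nodes; only the token list shows which
        # bracket (if any) the source used.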
5784 token = None
5785 for token in self.tokens[self.px + 1 :]:
5786 if token.kind == 'op' and token.value in '([':
5787 break
5788 if is_significant_token(token):
5789 # An implicit tuple: there is no '(' or '[' token.
5790 token = None
5791 break
5792 else:
5793 raise AssignLinksError('Ill-formed tuple') # pragma: no cover
5794 if token:
5795 self.op(token.value)
5796 for i, pattern in enumerate(patterns):
5797 self.visit(pattern)
5798 if token:
5799 self.op(']' if token.value == '[' else ')')
5800 #@+node:ekr.20220401034726.8: *7* tog.MatchSingleton
5801 # MatchSingleton(constant value)
5803 def do_MatchSingleton(self, node: Node) -> None:
5804 """Match True, False or None."""
5805 # g.trace(repr(node.value))
5806 self.token('name', repr(node.value))
5807 #@+node:ekr.20220401034726.9: *7* tog.MatchStar
5808 # MatchStar(identifier? name)
5810 def do_MatchStar(self, node: Node) -> None:
5811 name = getattr(node, 'name', None)
5812 self.op('*')
5813 if name:
5814 self.name(name)
5815 #@+node:ekr.20220401034726.10: *7* tog.MatchValue
5816 # MatchValue(expr value)
5818 def do_MatchValue(self, node: Node) -> None:
5820 self.visit(node.value)
5821 #@+node:ekr.20191113063144.78: *6* tog.Nonlocal
5822 # Nonlocal(identifier* names)
5824 def do_Nonlocal(self, node: Node) -> None:
5826        # 'nonlocal %s\n' % ','.join(node.names)
5827 # No need to put commas.
5828 self.name('nonlocal')
5829 for z in node.names:
5830 self.name(z)
5831 #@+node:ekr.20191113063144.79: *6* tog.Pass
5832 def do_Pass(self, node: Node) -> None:
5834 self.name('pass')
5835 #@+node:ekr.20191113063144.81: *6* tog.Raise
5836 # Raise(expr? exc, expr? cause)
5838 def do_Raise(self, node: Node) -> None:
5840 # No need to put commas.
5841 self.name('raise')
5842 exc = getattr(node, 'exc', None)
5843 cause = getattr(node, 'cause', None)
5844 tback = getattr(node, 'tback', None)
5845 self.visit(exc)
5846 if cause:
5847 self.name('from') # #2446.
5848 self.visit(cause)
5849 self.visit(tback)
5850 #@+node:ekr.20191113063144.82: *6* tog.Return
5851 def do_Return(self, node: Node) -> None:
5853 self.name('return')
5854 self.visit(node.value)
5855 #@+node:ekr.20191113063144.85: *6* tog.Try
5856 # Try(stmt* body, excepthandler* handlers, stmt* orelse, stmt* finalbody)
5858 def do_Try(self, node: Node) -> None:
5860 # Try line...
5861 self.name('try')
5862 self.op(':')
5863 # Body...
5864 self.level += 1
5865 self.visit(node.body)
5866 self.visit(node.handlers)
5867 # Else...
5868 if node.orelse:
5869 self.name('else')
5870 self.op(':')
5871 self.visit(node.orelse)
5872 # Finally...
5873 if node.finalbody:
5874 self.name('finally')
5875 self.op(':')
5876 self.visit(node.finalbody)
5877 self.level -= 1
5878 #@+node:ekr.20191113063144.88: *6* tog.While
5879 def do_While(self, node: Node) -> None:
5881 # While line...
5882        # 'while %s:\n'
5883 self.name('while')
5884 self.visit(node.test)
5885 self.op(':')
5886 # Body...
5887 self.level += 1
5888 self.visit(node.body)
5889 # Else clause...
5890 if node.orelse:
5891 self.name('else')
5892 self.op(':')
5893 self.visit(node.orelse)
5894 self.level -= 1
5895 #@+node:ekr.20191113063144.89: *6* tog.With
5896 # With(withitem* items, stmt* body)
5898 # withitem = (expr context_expr, expr? optional_vars)
5900 def do_With(self, node: Node) -> None:
5902 expr: Optional[ast.AST] = getattr(node, 'context_expression', None)
5903 items: List[ast.AST] = getattr(node, 'items', [])
5904 self.name('with')
5905 self.visit(expr)
5906 # No need to put commas.
5907 for item in items:
5908 self.visit(item.context_expr)
5909 optional_vars = getattr(item, 'optional_vars', None)
5910 if optional_vars is not None:
5911 self.name('as')
5912 self.visit(item.optional_vars)
5913 # End the line.
5914 self.op(':')
5915 # Body...
5916 self.level += 1
5917 self.visit(node.body)
5918 self.level -= 1
5919 #@+node:ekr.20191113063144.90: *6* tog.Yield
5920 def do_Yield(self, node: Node) -> None:
5922 self.name('yield')
5923 if hasattr(node, 'value'):
5924 self.visit(node.value)
5925 #@+node:ekr.20191113063144.91: *6* tog.YieldFrom
5926 # YieldFrom(expr value)
5928 def do_YieldFrom(self, node: Node) -> None:
5930 self.name('yield')
5931 self.name('from')
5932 self.visit(node.value)
5933 #@-others
5934#@+node:ekr.20191226195813.1: *3* class TokenOrderTraverser
5935class TokenOrderTraverser:
5936 """
5937 Traverse an ast tree using the parent/child links created by the
5938 TokenOrderGenerator class.
5940 **Important**:
5942 This class is a curio. It is no longer used in this file!
5943 The Fstringify and ReassignTokens classes now use ast.walk.
5944 """
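    # A minimal sketch of the replacement pattern (an assumption, not code from
    # this file): `for node in ast.walk(tree): visit(node)` visits every node,
    # but in breadth-first order rather than token order.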
5945 #@+others
5946 #@+node:ekr.20191226200154.1: *4* TOT.traverse
5947 def traverse(self, tree: Node) -> int:
5948 """
5949 Call visit, in token order, for all nodes in tree.
5951 Recursion is not allowed.
5953 The code follows p.moveToThreadNext exactly.
5954 """
5956 def has_next(i: int, node: Node, stack: List[int]) -> bool:
5957            """Return True if node.parent has a child at index i."""
5958 # g.trace(node.__class__.__name__, stack)
5959 parent = node.parent
5960 return bool(parent and parent.children and i < len(parent.children))
5962 # Update stats
5964 self.last_node_index = -1 # For visit
5965 # The stack contains child indices.
5966 node, stack = tree, [0]
5967 seen = set()
5968 while node and stack:
5969 # g.trace(
5970 # f"{node.node_index:>3} "
5971 # f"{node.__class__.__name__:<12} {stack}")
5972 # Visit the node.
5973 assert node.node_index not in seen, node.node_index
5974 seen.add(node.node_index)
5975 self.visit(node)
5976 # if p.v.children: p.moveToFirstChild()
5977 children: List[ast.AST] = getattr(node, 'children', [])
5978 if children:
5979 # Move to the first child.
5980 stack.append(0)
5981 node = children[0]
5982 # g.trace(' child:', node.__class__.__name__, stack)
5983 continue
5984 # elif p.hasNext(): p.moveToNext()
5985 stack[-1] += 1
5986 i = stack[-1]
5987 if has_next(i, node, stack):
5988 node = node.parent.children[i]
5989 continue
5990 # else...
5991 # p.moveToParent()
5992 node = node.parent
5993 stack.pop()
5994 # while p:
5995 while node and stack:
5996 # if p.hasNext():
5997 stack[-1] += 1
5998 i = stack[-1]
5999 if has_next(i, node, stack):
6000 # Move to the next sibling.
6001 node = node.parent.children[i]
6002 break # Found.
6003 # p.moveToParent()
6004 node = node.parent
6005 stack.pop()
6006 # not found.
6007 else:
6008 break # pragma: no cover
6009 return self.last_node_index
6010 #@+node:ekr.20191227160547.1: *4* TOT.visit
6011 def visit(self, node: Node) -> None:
6013 self.last_node_index += 1
6014 assert self.last_node_index == node.node_index, (
6015 self.last_node_index, node.node_index)
6016 #@-others
6017#@-others
6018g = LeoGlobals()
6019if __name__ == '__main__':
6020 main() # pragma: no cover
6021#@@language python
6022#@@tabwidth -4
6023#@@pagewidth 70
6024#@-leo