Re: Extract data from file

Author: Kaz Kylheku
Date:  
To: Roger Mason
CC: txr-users
Subject: Re: Extract data from file
On 2023-04-25 04:58, Roger Mason wrote:
> @(collect)
> @ (some)
> @spsymb : spsymb
> @spname : spname
> @spzn  : spzn
> @spmass : spmass
> @rminsp @rmt @rmaxsp @ nrmt : rminsp, rmt, rmaxsp, nrmt
> @ (end)
> @(end)
> @(output)
> spsymb spname spzn spmass rminsp rmt rmaxsp nrmt
> @spsymb @spname @spzn @spmass @rminsp @rmt @rmaxsp @nrmt
> @(end)
>
> but I don't see how to remove the unwanted spaces after spzn.


Hi Roger,

The following diff to your code produces output identical to the
expected output:

diff --git a/Si_symb_etc.txr b/Si_symb_etc.txr
index ae49486..d8a4350 100644
--- a/Si_symb_etc.txr
+++ b/Si_symb_etc.txr
@@ -2,12 +2,12 @@
@ (some)
@spsymb : spsymb
@spname : spname
- @spzn  : spzn
+ @spzn : spzn
@spmass : spmass
@rminsp @rmt @rmaxsp @ nrmt : rminsp, rmt, rmaxsp, nrmt
@ (end)
@(end)
@(output)
- spsymb spname spzn spmass rminsp rmt rmaxsp nrmt
+spsymb spname spzn spmass rminsp rmt rmaxsp nrmt
@spsymb @spname @spzn @spmass @rminsp @rmt @rmaxsp @nrmt
@(end)

As shown at the Bash prompt:

$ diff <(txr Si_symb_etc.txr Si.in) Si_symb_etc.txt
[no output]

If you dump the bindings with -B, you can see which variable is getting the
extra spaces. This is with the original code:

$ txr -B Si_symb_etc.txr Si.in
 spsymb spname spzn spmass rminsp rmt rmaxsp nrmt
 'Si' 'silicon' -14.0000                                 51196.73454 0.534522E-06 2.2000 47.8169 400
spsymb[0]="'Si'"
spname[0]="'silicon'"
spzn[0]="-14.0000                                "
spmass[0]="51196.73454"
rminsp[0]="0.534522E-06"
rmt[0]="2.2000"
rmaxsp[0]="47.8169"
nrmt[0]="400"
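
For a longer dump, a small script can flag the padded bindings for you.
This is only a minimal Python sketch and not part of TXR: it assumes the
-B output has the name[index]="value" shape shown above, and the helper
name padded_bindings is made up for illustration.

```python
import re

# One binding line of `txr -B` output, e.g. spzn[0]="-14.0000   "
# (assumed shape, taken from the dump above; escaped quotes are not handled).
BINDING = re.compile(r'^(\w+(?:\[\d+\])*)="(.*)"$')

def padded_bindings(dump):
    """Return the names of bindings whose value has leading or trailing whitespace."""
    hits = []
    for line in dump.splitlines():
        m = BINDING.match(line)
        if m and m.group(2) != m.group(2).strip():
            hits.append(m.group(1))
    return hits

if __name__ == "__main__":
    sample = "\n".join([
        'spname[0]="\'silicon\'"',
        'spzn[0]="-14.0000        "',
        'spmass[0]="51196.73454"',
    ])
    print(padded_bindings(sample))
```

Feeding it the saved output of txr -B should list only the variables that
picked up stray whitespace.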


You had two spaces after @spzn, which are treated literally. A single-space
token in TXR is slightly magic: it matches one or more spaces. A two-space
token, like the one in @a  @b, is literal. So @a @b can match "foo     bar",
but @a  @b will only match "foo  bar" and not "foo bar" or "foo   bar".
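
In regex terms, a rough analogy is that the single-space token behaves like
" +", while a longer run of spaces has to appear exactly as written. The
Python sketch below models only that rule, plus variables as shortest
matches; it is not how TXR is implemented. The helper names
txr_line_to_regex and match_line are hypothetical, and the sample input
line is invented to resemble the spzn line.

```python
import re

def txr_line_to_regex(pattern):
    """Translate a tiny subset of a TXR pattern line into a regex (illustrative model only)."""
    parts = []
    # Tokens: @variable, a run of spaces, or any other literal text.
    for tok in re.findall(r"@[A-Za-z_]\w*| +|[^@ ]+", pattern):
        if tok.startswith("@"):
            parts.append("(?P<%s>.*?)" % tok[1:])  # variable: shortest match
        elif tok == " ":
            parts.append(" +")                     # single space: one or more spaces
        else:
            parts.append(re.escape(tok))           # longer space runs and text: literal
    return "^" + "".join(parts) + "$"

def match_line(pattern, line):
    """Return the variable bindings as a dict, or None if the line does not match."""
    m = re.match(txr_line_to_regex(pattern), line)
    return m.groupdict() if m else None

if __name__ == "__main__":
    line = "-14.0000" + " " * 34 + ": spzn"   # hypothetical input line
    print(match_line("@spzn  : spzn", line))  # spzn keeps 32 trailing spaces
    print(match_line("@spzn : spzn", line))   # spzn is just '-14.0000'
    print(match_line("@a @b", "foo     bar"))
    print(match_line("@a  @b", "foo bar"))    # None: two literal spaces required
```

With the two-space pattern, the variable absorbs all the padding except the
two spaces that the pattern itself consumes, which is the effect visible in
the spzn binding.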



> For the other two outputs I don't see how to isolate the wanted parts of
> the input file. I am using TXR 283.


I am looking at the other two expected outputs now.

Cheers ...