forums.silverfrost.com Welcome to the Silverfrost forums
Notquitenewton
Joined: 25 May 2021 Posts: 20 Location: England, UK
|
Posted: Fri Jan 27, 2023 5:52 pm Post subject: Tab separated fields |
|
|
I have a row of text. The row holds subtitles, each containing a space, and each subtitle is separated from the next by a tab. I've roughly coded this (below) as an example, and it gets mixed up when the subtitle width, including the space, is about 8, which I think is the default tab width. There is probably a much better way of doing this but I can't see it.
Example...
title space 1 tab titleabcd space 2 tab next title and so on.
I'm copying a row from Excel into a text file and it is tab separated, with 1 or 2 spaces in each cell. A bit puzzled at present. The subtitle can be anywhere between 1 and 20 characters long, including a couple of spaces. Any help welcomed.
IMPLICIT NONE
INCLUDE <Windows.ins> , nolist
INTEGER i, j, jj
INTEGER (KIND=2) kk
CHARACTER*1000 numstr
CHARACTER*20 title(50)
numstr = 'titleab 1 titleabc 2 titleabcd 3 titleabcde 4
& titleabcdef 5 titleabcdefg 6 titleabcdefgh 7
& titleabcdefghi 8 titleabcdefghij 9'
CALL compress@(numstr,kk)
CALL trim@(numstr)
do i = 1, 50
CALL trim@(numstr)
jj = leng(numstr)
j = index(numstr(1:jj),CHAR(9))
if (j .ne. 0) then
title(i) = numstr(1:j-1)
print *, title(i)
numstr = numstr(j+1:jj)
endif
enddo
end |
mecej4
Joined: 31 Oct 2006 Posts: 1840
|
Posted: Sat Jan 28, 2023 1:09 am Post subject: |
|
|
One of the undesirable consequences of using fixed size strings is having to contend with a lot of padding.
Does the following code satisfy your needs?
Code: | program xyz
implicit none
integer i, j, jj
character*1000 numstr
character*20 title(50)
character*1 :: tab=char(9)
numstr = 'titleab 1' //tab//'titleabc 2' //tab//'titleabcd 3' //tab// &
&'titleabcde 4' //tab//'titleabcdef 5' //tab//'titleabcdefg 6'//tab// &
&'titleabcdefgh 7'//tab//'titleabcdefghi 8'//tab//'titleabcdefghij 9'
jj = 1
do i = 1, 50
j = index(numstr(jj:),tab)
if(j == 0)then
title(i) = trim(numstr(jj:))
else
if (j > 21) stop 'cell size > allowed 20'
title(i) = numstr(jj:jj+j-2)
endif
print *, '|'//trim(title(i))//'|'
if(j == 0)exit
jj=jj+j
enddo
end |
if you wish to split off the title string from the number, you can add a couple of lines of code using INDEX(title(i),' '), as follows:
Code: | program xyz
implicit none
integer i, j, jj, k
integer, parameter :: NT = 50
character*1000 numstr
character*20 title(NT)
character*22 cell
integer num(NT)
character*1 :: tab=char(9)
numstr = 'titleab 1' //tab//'titleabc 2' //tab//'titleabcd 3' //tab// &
&'titleabcde 4' //tab//'titleabcdef 5' //tab//'titleabcdefg 6'//tab// &
&'titleabcdefgh 7'//tab//'titleabcdefghi 8'//tab//'titleabcdefghij 9'
jj = 1
do i = 1, NT
j = index(numstr(jj:),tab)
if(j == 0)then
cell = trim(numstr(jj:))
else
if (j > 21) stop 'cell size > allowed 20'
cell = numstr(jj:jj+j-2)
endif
k = index(cell,' ')
title(i) = cell(1:k-1)
read(cell(k+1:),*)num(i)   ! list-directed read; '(I)' without a width is a non-standard extension
print 10,i,title(i),num(i)
10 format(i2,' |',A,'| ',i5)
if(j == 0)exit
jj=jj+j
enddo
end |
Notquitenewton
Joined: 25 May 2021 Posts: 20 Location: England, UK
|
Posted: Sat Jan 28, 2023 11:25 am Post subject: Tab separated fields |
|
|
To mecej4,
Many thanks for your reply and for taking the time. The one you wrote works perfectly. The problem I've encountered (it's not FTN95 or you, but probably me) is that when I read the string from a text file, which is the real use case, it gets confused...and so do I.
Latest code (location of txt file is in directory xxxxx).
implicit none
integer i, j, jj
character*1000 numstr
character*20 title(50)
character*1 :: tab=char(9)
OPEN (UNIT=8,FILE='xxxxx\adatac.txt',STATUS='UNKNOWN')
read(8,'(A)') numstr
jj = 1
do i = 1, 50
j = index(numstr(jj:),tab)
if(j == 0)then
title(i) = trim(numstr(jj:))
else
if (j > 21) stop 'cell size > allowed 20'
title(i) = numstr(jj:jj+j-2)
endif
print *, '|'//trim(title(i))//'|'
if(j == 0)exit
jj=jj+j
enddo
close(unit=8,status='keep')
end
The txt file contains...
titleab 1 titleabc 2 titleabcd 3 titleabcde 4 titleabcdef 5
Space between title and number and a tab between number and next title.
I think the issue might be to do with tabs and spaces but I can't see it!
Thanks again.
Notquitenewton |
Notquitenewton
Joined: 25 May 2021 Posts: 20 Location: England, UK
|
Posted: Sat Jan 28, 2023 12:11 pm Post subject: Tab separated fields |
|
|
mecej4,
A quick follow up...reading the string from a txt file didn't work but if I replace the line
j = index(numstr(jj:),tab)
with
j = index(numstr(jj:),' ')
it is showing signs of working but might have to tweak it a bit. When I used the sdbg (debugger) to see what the code was doing, the tab variable just showed a question mark! I guess this is a function of Fortran.
Notquitenewton |
mecej4
Joined: 31 Oct 2006 Posts: 1840
|
Posted: Sat Jan 28, 2023 12:15 pm Post subject: |
|
|
Please provide an exact copy of the data file adatac.txt .
Program source and/or data in text form, when posted in line in a user post in this and similar forums, can get mangled -- for instance, the tab characters that you are processing are either invisible, or may get replaced with spaces, etc. When such mangling occurs, the number of possible bugs increases, and debugging gets more difficult.
Please upload the file(s) to a cloud service (Dropbox, Google Drive, etc.) or the www.pcmodfit.co.uk site that you used in a previous post to this forum. |
Notquitenewton
Joined: 25 May 2021 Posts: 20 Location: England, UK
|
Posted: Sat Jan 28, 2023 1:10 pm Post subject: Tab separated fields |
|
|
mecej4
To avoid scrambled text with the program, as you commented, the links are below, but you'll have to change the file location to suit you. I've also added a space in the 2nd title to see what would happen and it was ok. The code works well, but I've got to see where the empty titles are coming from and get rid of the spaces before each title when it prints.
This routine will be invaluable, certainly to me (and others) as I export lots of data from Excel to txt files before running Fortran. Putting a keyboard tab in the line I mentioned instead of using the tab variable seems to work.
Thanks again for your help.
https://www.pcmodfit.co.uk/adatac.txt
https://www.pcmodfit.co.uk/stringnew.f90 |
mecej4
Joined: 31 Oct 2006 Posts: 1840
|
Posted: Sat Jan 28, 2023 2:46 pm Post subject: |
|
|
Here are three reasons why your program stringnew.f90 failed:
1. Your program does not search for and process tab characters (char(9)) at all.
2. In your data file, the second field contains "title abc 2". Note the space character between "title" and "abc". The program assumes that the titles have no embedded spaces.
3. There is a problem with the statement
Code: | j = index(numstr(jj:),' ') |
The file has an actual tab character within the single quotes, and that will not work. In fact, the program editor, compiler and RTL will replace that literal tab character with something else or cause the index() function to return 0.
The following program should work correctly after the data file is corrected as to Item 2. above. It does work with other compilers, but with FTN95 it stops after outputting one line. I am going to troubleshoot that a little while later.
Code: | program xyz
implicit none
integer i, j, jj, k
integer, parameter :: NT = 50
character*200 numstr
character*20 title(NT)
character*22 cell
integer num(NT)
character*1 :: tab=char(9)
open(11,file='adatac.txt',status='old')
read(11,'(A)')numstr
close(11)
jj = 1
do i = 1, NT
j = index(numstr(jj:),tab)
if(j == 0)then
cell = trim(numstr(jj:))
else
if (j > 21) stop 'cell size > allowed 20'
cell = numstr(jj:jj+j-2)
endif
k = index(cell,' ')
title(i) = cell(1:k-1)
read(cell(k+1:k+1),'(I1)')num(i)
print 10,i,title(i),num(i)
10 format(i2,' |',A,'| ',i5)
if(j == 0)exit
jj=jj+j
enddo
end |
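On Item 2, one way to tolerate embedded spaces in a title is to split each cell at its last blank rather than its first. The following is only a minimal sketch, assuming every cell ends with a blank followed by an integer; the program name and the sample cell are made up for illustration.

```fortran
program split_last_blank
  implicit none
  character(len=22) :: cell
  character(len=20) :: title
  integer :: num, k, n, ios

  cell = 'title abc 2'                      ! sample cell with an embedded space
  n = len_trim(cell)                        ! ignore the trailing padding
  k = index(cell(1:n), ' ', back=.true.)    ! position of the LAST blank
  if (k == 0) stop 'no blank found: cell has no number'
  title = cell(1:k-1)                       ! everything before the last blank
  read (cell(k+1:n), *, iostat=ios) num     ! list-directed read of the number
  if (ios /= 0) stop 'trailing field is not an integer'
  print '(a,a,a,i5)', '|', trim(title), '| ', num
end program split_last_blank
```

Searching with back=.true. keeps "title abc" together as the title and leaves only the final token for the numeric read, so the data file would not need to be edited.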
Notquitenewton
Joined: 25 May 2021 Posts: 20 Location: England, UK
|
Posted: Sat Jan 28, 2023 3:45 pm Post subject: Tab separated fields |
|
|
It seems that FTN95 does not recognize CHAR(9) as a character if searched for in a string.
When I run the debugger it shows tab (defined as CHAR(9)) as a ?.
This is the problem methinks.
I'm using v8.95.0
Notquitenewton |
mecej4
Joined: 31 Oct 2006 Posts: 1840
|
Posted: Sat Jan 28, 2023 3:53 pm Post subject: Bug in FTN95, inputting text containing tab characters |
|
|
The bug is not in INDEX(). It is in input conversion under format A. The tab characters are lost when a READ with A format is done. The character variable does not have any tab characters in it, and it is no surprise that INDEX cannot find tabs that are not present.
I have written up a bug report on the expansion of tab characters to a number of spaces ("Formatted read with A format converts tabs to spaces"):
http://forums.silverfrost.com/viewtopic.php?p=33764 |
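A quick way to check whether tabs survive a formatted read is to count them in the character variable straight after the READ. The sketch below is an illustration only: the file name tabtest.tmp and the program name are hypothetical, and it assumes ASCII so that char(9) is the horizontal tab.

```fortran
program count_tabs
  implicit none
  character(len=1), parameter :: tab = char(9)
  character(len=200) :: line
  integer :: i, ntabs
  integer, parameter :: u = 11

  ! Write a small scratch file containing two tab characters.
  open (u, file='tabtest.tmp', status='replace', action='write')
  write (u, '(a)') 'titleab 1'//tab//'titleabc 2'//tab//'titleabcd 3'
  close (u)

  ! Read it back with A format, as the programs in this thread do.
  open (u, file='tabtest.tmp', status='old', action='read')
  read (u, '(a)') line
  close (u, status='delete')

  ! Count the tab characters that actually reached the variable.
  ntabs = 0
  do i = 1, len_trim(line)
     if (line(i:i) == tab) ntabs = ntabs + 1
  end do
  print '(a,i0)', 'tabs found after READ: ', ntabs
  if (ntabs == 0) print '(a)', 'tabs were expanded or dropped on input'
end program count_tabs
```

A compiler that passes tabs through unchanged should report 2. If the count is 0, INDEX has nothing to find, which matches the behaviour described in this post (assuming the WRITE itself left the tabs intact).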
JohnCampbell
Joined: 16 Feb 2006 Posts: 2505 Location: Sydney
|
Posted: Sun Jan 29, 2023 7:42 am Post subject: |
|
|
The following code copes whether horizontal tabs (char(9)) are present or have been replaced by at least 3 spaces.
In FTN95, "call READ_TABS@ (8)" stops tabs being replaced by a variable number of spaces, which I think could be anywhere from 0 to 8 rather than a minimum of 3.
It is much simpler to keep the tabs than to cope with cases where a tab has been replaced by only 0 to 2 spaces.
Code: | implicit none
integer i, k, jj, nc, jh, js
character numstr*1000
character title(50)*20
character*1 :: tab = char(9)
character :: file_name*50
file_name = 'c:\pcmodfit v7.7\Results\adatac.txt'
file_name = 'adataj.txt' ! adjusted file terminated with [CR][LF]
OPEN (UNIT=8, FILE=file_name, STATUS='UNKNOWN')
! call READ_TABS@ (8) ! activate to read tabs
read (8,'(A)') numstr
nc = len_trim (numstr)
numstr(nc+1:nc+1) = tab
jj = 1
i = 0
do k = 1, nc
if ( jj > nc ) exit
if ( numstr(jj:jj) == ' ') then ! left justify next field
write (*,11) jj,-1,0, numstr(jj:jj)
jj = jj+1
cycle
end if
jh = index (numstr(jj:),tab) ; if ( jh <= 0 ) jh = len(numstr)+2-jj
js = index (numstr(jj:),'   ') ; if ( js <= 0 ) js = nc+2-jj   ! search string is 3 spaces
if (jh < js) then ! select field terminated with TAB
i = i+1
title(i) = numstr(jj:jj+jh-2)
write (*,11) jj,jh,i, trim (title(i)),'[HT]'
if (jh > 21) write (*,*) 'cell size > allowed 20'
jj = jj+jh
else ! select field terminated with 3 spaces
i = i+1
title(i) = numstr(jj:jj+js-1)
write (*,11) jj,js,i, trim (title(i))
if (js > 21) write (*,*) 'cell size > allowed 20'
jj = jj+js
end if
end do
11 format ( i4,i3,i2,': ',a,a)
close (unit=8) ! ,status='keep')
end |
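If the tabs are known to be intact (for example after READ_TABS@ has been called on the unit), the splitting can be reduced to a search for a single delimiter. The sketch below is one possible way to do that, not the code used in this thread: the routine name split_on and the sample line are made up, and the line is built in memory so the example does not depend on any file or on FTN95-specific routines.

```fortran
program split_tabs
  implicit none
  character(len=1), parameter :: tab = char(9)
  integer, parameter :: nt = 50
  character(len=1000) :: numstr
  character(len=20)   :: title(nt)
  integer :: i, n

  ! Sample line: an embedded space in field 2 and an empty field 3.
  numstr = 'titleab 1'//tab//'title abc 2'//tab//tab//'titleabcd 3'
  call split_on(numstr, tab, title, n)
  do i = 1, n
     print '(i2,a,a,a)', i, ' |', trim(title(i)), '|'
  end do

contains

  ! Split line at every occurrence of delim; nf returns the field count.
  subroutine split_on(line, delim, fields, nf)
    character(len=*), intent(in)  :: line
    character(len=1), intent(in)  :: delim
    character(len=*), intent(out) :: fields(:)
    integer,          intent(out) :: nf
    integer :: start, pos, last

    nf = 0
    start = 1
    last = len_trim(line)
    do while (start <= last .and. nf < size(fields))
       pos = index(line(start:last), delim)
       if (pos == 0) pos = last - start + 2   ! no delimiter left: take the rest
       nf = nf + 1
       ! Zero-length slice (blank field) when two delimiters are adjacent.
       fields(nf) = adjustl(line(start:start+pos-2))
       start = start + pos
    end do
  end subroutine split_on
end program split_tabs
```

Because only the delimiter ends a field, spaces inside a subtitle are kept. Fields longer than the 20 characters of title are silently truncated in this sketch, so a length check like the one in the code above would still be worth adding.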
Notquitenewton
Joined: 25 May 2021 Posts: 20 Location: England, UK
|
Posted: Sun Jan 29, 2023 10:36 am Post subject: Tab separated fields |
|
|
JohnCampbell
This does work and it even lets the user add spaces in each subtitle...something I've been struggling with for some time now!
Thank you very much for this code...it will certainly make my life easier for handling subtitles when getting Excel to export cell contents into txt files with Tab separators.
Looking more closely at your code, it seems the READ_TABS@ call is the pivotal one, which I didn't know about. I read parts of the manual now and then and obviously missed it!
I just tried this out in my suite of programs and it works like a dream.
Thanks again JohnCampbell.
Notquitenewton
 |
DanRRight

Joined: 10 Mar 2008 Posts: 2777 Location: South Pole, Antarctica
|
Posted: Mon Jan 30, 2023 10:19 am Post subject: |
|
|
Two posts about tabs in two days... By statistics this should happen once in ~1000 years.
This is what I am saying all the time. This is what happens when there are too few users, and all but a few of them are procrastinators waiting for others to report problems and bugs and to make suggestions for improvements. It took 30 years for users to ask about the most basic thing in Fortran: what a tab is, how long it is, how to treat it correctly, etc. All these decades they were just making their own workarounds or not using tabs at all.
The company has to be more active in involving users in the process of getting a better product. Advertise. Make good flashy examples, make reviews of new features. More users, more features, more profit.
Besides tabs in data files, problems also exist with tabs in Fortran source code and apps. By the way, I was surprised to find that, while I have been using tabs in Fortran sources all the time since FTN77/FTN90, all other users and all other Fortran compilers fear tabs like the devil, even in Fortran source code. Gfortran and Intel, for example, warn if you use tabs!
Also, I recently switched editor from Notetab Pro to Kate and was surprised that it handles tabs in free-form and fixed-form sources absolutely flawlessly, including in its coloring schemes. Everything just works automatically without any intervention from the user. This is the way Plato and %eb should work too. Again, if there were more users, somebody would have suggested this long ago. |
JohnCampbell
Joined: 16 Feb 2006 Posts: 2505 Location: Sydney
|
Posted: Thu Feb 02, 2023 3:23 am Post subject: |
|
|
Field delimiters have long been a problem for portability in Fortran. The 3 most common (from my usage) are :
* comma ","
* (horizontal) TAB
* semicolon ";"
Comma is by far the most used delimiter, being used in .csv files, but it has deficiencies with "business" notation and also conflicts between English and European number formats. Comma is a visible character, so it is good for text files, and it has been largely adopted via English EXCEL usage.
(What is the history of Europe replacing "." with "," !! I have received important numeric data, such as 1,250 from a European source to later be informed it meant 1.25 and the trailing 0 was not necessary. 1,25 would have been easier to question. They knew what numeric format I would be expecting!).
Horizontal Tab has been used for a long time, but suffers from not having a visible representation. In many editors it can be replaced by a "random" number of white spaces, which can change the text file format when being edited. The tab character is not included in the Fortran standard character set, so it is not a valid character, which makes its use so problematic. I first learnt to code on a 026 card punch, but never bothered to learn how to program the tabs for columns 7 and 73.
Semicolon is a great option, as it is a visible text character and does not conflict with number formats. Unfortunately it is not commonly used. Why haven't we had .ssv files more commonly used?
All the problems with hidden tabs or enclosing numbers in quotes ie "1,250" could have gone away !
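To illustrate why semicolons sit comfortably alongside European number formats, here is a minimal sketch that splits a line on ";" and then converts a decimal comma to a decimal point before the numeric read. The sample data and the program name are made up, and using IOSTAT to tell numbers from text is a crude test chosen only to keep the example short.

```fortran
program semicolon_fields
  implicit none
  character(len=80) :: line
  character(len=20) :: field, work
  real :: x
  integer :: start, pos, last, i, ios

  line = 'width;1,25;height;1250,0'           ! made-up sample data
  start = 1
  last = len_trim(line)
  do while (start <= last)
     pos = index(line(start:last), ';')
     if (pos == 0) pos = last - start + 2     ! last field: no delimiter follows
     field = line(start:start+pos-2)
     start = start + pos

     ! Work on a copy so that text fields are printed unchanged.
     work = field
     do i = 1, len_trim(work)
        if (work(i:i) == ',') work(i:i) = '.' ! decimal comma -> decimal point
     end do

     read (work, *, iostat=ios) x             ! fails (ios /= 0) for text fields
     if (ios == 0) then
        print '(a,f10.3)', 'number: ', x
     else
        print '(a,a)', 'text:   ', trim(field)
     end if
  end do
end program semicolon_fields
```

Since the comma never acts as a field separator here, "1,25" reaches the conversion step intact. Whether a value such as "1,250" uses a thousands separator or a decimal comma still has to be agreed with whoever supplies the data.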
LitusSaxonicum
Joined: 23 Aug 2005 Posts: 2385 Location: Yateley, Hants, UK
|
Posted: Thu Feb 02, 2023 11:14 am Post subject: |
|
|
John,
A really helpful exposition. I normally use CSV when exporting from Excel and didn't know (until I looked after reading your post) that exporting with tabs as a delimiter was possible. (I can't find SSV).
I used to use tabs to get to column 7 a lot. Why did that fall foul of the thought police?
Eddie |
mecej4
Joined: 31 Oct 2006 Posts: 1840
|
Posted: Thu Feb 02, 2023 1:14 pm Post subject: |
|
|
Semicolons as field separators have an advantage over the other characters (comma, tab) in that they are rarely used within text field data. Tabs are not visible. Commas, single quotes and hyphens may occur in addresses and names of people, places, etc. |