Tab separated fields
Notquitenewton



Joined: 25 May 2021
Posts: 20
Location: England, UK

Posted: Fri Jan 27, 2023 5:52 pm    Post subject: Tab separated fields

I have a row of text. In this row there are subtitles containing a space and each subtitle is tab separated from another subtitle. I've roughly coded this (below) as an example and it is getting mixed up when the subtitle width including the space is about 8 which I think is the default tab. There is probably a much better way of doing this but I can't see it.
Example...
title space 1 tab titleabcd space 2 tab next title and so on.
I'm copying a row from Excel into a text file and it is tab separated, with 1 or 2 spaces in each cell. A bit puzzled at present. Each subtitle can be anywhere between 1 and 20 characters long, including a couple of spaces. Any help welcomed.


IMPLICIT NONE
INCLUDE <Windows.ins> , nolist
INTEGER i, j, jj
INTEGER (KIND=2) kk
CHARACTER*1000 numstr
CHARACTER*20 title(50)
numstr = 'titleab 1 titleabc 2 titleabcd 3 titleabcde 4
& titleabcdef 5 titleabcdefg 6 titleabcdefgh 7
& titleabcdefghi 8 titleabcdefghij 9'
CALL compress@(numstr,kk)
CALL trim@(numstr)
do i = 1, 50
CALL trim@(numstr)
jj = leng(numstr)
j = index(numstr(1:jj),CHAR(9))
if (j .ne. 0) then
title(i) = numstr(1:j-1)
print *, title(i)
numstr = numstr(j+1:jj)
endif
enddo
end
mecej4



Joined: 31 Oct 2006
Posts: 1884

Posted: Sat Jan 28, 2023 1:09 am

One of the undesirable consequences of using fixed size strings is having to contend with a lot of padding.

Does the following code satisfy your needs?

Code:
program xyz
   implicit none
   integer i, j, jj
   character*1000 numstr
   character*20 title(50)
   character*1 :: tab=char(9)
   numstr = 'titleab 1'       //tab//'titleabc 2'      //tab//'titleabcd 3'   //tab// &
            &'titleabcde 4'   //tab//'titleabcdef 5'   //tab//'titleabcdefg 6'//tab// &
            &'titleabcdefgh 7'//tab//'titleabcdefghi 8'//tab//'titleabcdefghij 9'
   jj = 1
   do i = 1, 50
      j = index(numstr(jj:),tab)
      if(j == 0)then
         title(i) = trim(numstr(jj:))
      else
         if (j > 21) stop 'cell size > allowed 20'
         title(i) = numstr(jj:jj+j-2)
      endif
      print *, '|'//trim(title(i))//'|'
      if(j == 0)exit
      jj=jj+j
   enddo
end

If you wish to split off the title string from the number, you can add a couple of lines of code using INDEX(title(i),' '), as follows:
Code:
program xyz
   implicit none
   integer i, j, jj, k
   integer, parameter :: NT = 50
   character*1000 numstr
   character*20 title(NT)
   character*22 cell
   integer num(NT)
   character*1 :: tab=char(9)
   numstr = 'titleab 1'       //tab//'titleabc 2'      //tab//'titleabcd 3'   //tab// &
            &'titleabcde 4'   //tab//'titleabcdef 5'   //tab//'titleabcdefg 6'//tab// &
            &'titleabcdefgh 7'//tab//'titleabcdefghi 8'//tab//'titleabcdefghij 9'
   jj = 1
   do i = 1, NT
      j = index(numstr(jj:),tab)
      if(j == 0)then
         cell = trim(numstr(jj:))
      else
         if (j > 21) stop 'cell size > allowed 20'
         cell = numstr(jj:jj+j-2)
      endif
      k = index(cell,' ')
      title(i) = cell(1:k-1)
      read(cell(k+1:),*)num(i)   ! list-directed read; '(I)' with no width is non-standard
      print 10,i,title(i),num(i)
   10 format(i2,' |',A,'| ',i5)
      if(j == 0)exit
      jj=jj+j
   enddo
end
Notquitenewton



Joined: 25 May 2021
Posts: 20
Location: England, UK

Posted: Sat Jan 28, 2023 11:25 am    Post subject: Tab separated fields

To mecej4,
Many thanks for your reply and for taking the time. The code you wrote works perfectly. The problem I've encountered (not FTN95's fault or yours, but probably mine) is that when I read the string from a text file, which will be the case in practice, it gets confused... and so do I.
Latest code (location of txt file is in directory xxxxx).

   implicit none
   integer i, j, jj
   character*1000 numstr
   character*20 title(50)
   character*1 :: tab=char(9)
   OPEN (UNIT=8,FILE='xxxxx\adatac.txt',STATUS='UNKNOWN')
   read(8,'(A)') numstr
   jj = 1
   do i = 1, 50
      j = index(numstr(jj:),tab)
      if(j == 0)then
         title(i) = trim(numstr(jj:))
      else
         if (j > 21) stop 'cell size > allowed 20'
         title(i) = numstr(jj:jj+j-2)
      endif
      print *, '|'//trim(title(i))//'|'
      if(j == 0)exit
      jj=jj+j
   enddo
   close(unit=8,status='keep')
end

The txt file contains...
titleab 1 titleabc 2 titleabcd 3 titleabcde 4 titleabcdef 5

Space between title and number and a tab between number and next title.
I think the issue might be to do with tabs and spaces but I can't see it!
Thanks again.
Notquitenewton
Notquitenewton



Joined: 25 May 2021
Posts: 20
Location: England, UK

Posted: Sat Jan 28, 2023 12:11 pm    Post subject: Tab separated fields

mecej4,
A quick follow up...reading the string from a txt file didn't work but if I replace the line
j = index(numstr(jj:),tab)
with
j = index(numstr(jj:),' ')
it is showing signs of working but might have to tweak it a bit. When I used the sdbg (debugger) to see what the code was doing, the tab variable just showed a question mark! I guess this is a function of Fortran.
Notquitenewton
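
The question mark in the debugger is consistent with the tab simply being a non-printing character, so a more telling check is to print character codes rather than the characters themselves. The following is a minimal sketch using only the standard IACHAR and ACHAR intrinsics; the program name and the sample string are made up for illustration.

```fortran
program show_tab_code
   implicit none
   character(len=1), parameter :: tab = achar(9)   ! horizontal tab, ASCII code 9
   character(len=12) :: sample                     ! hypothetical test string
   integer :: i

   ! Two fields separated by one tab, built in code so the tab cannot be mangled
   sample = 'ab 1' // tab // 'cd 2'

   print '(a,i0)', 'code of tab variable = ', iachar(tab)

   ! List position and code of every character so invisible ones show up
   do i = 1, len_trim(sample)
      print '(i3,1x,i4)', i, iachar(sample(i:i))
   end do
end program show_tab_code
```

Position 5 should be listed with code 9. Applying the same loop to a string read from a file shows whether any tabs actually arrived in the variable.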
mecej4



Joined: 31 Oct 2006
Posts: 1884

Posted: Sat Jan 28, 2023 12:15 pm

Please provide an exact copy of the data file adatac.txt .

Program source and/or data in text form, when posted in line in a user post in this and similar forums, can get mangled -- for instance, the tab characters that you are processing are either invisible, or may get replaced with spaces, etc. When such mangling occurs, the number of possible bugs increases, and debugging gets more difficult.

Please upload the file(s) to a cloud service (Dropbox, Google Drive, etc.) or the www.pcmodfit.co.uk site that you used in a previous post to this forum.
Notquitenewton



Joined: 25 May 2021
Posts: 20
Location: England, UK

Posted: Sat Jan 28, 2023 1:10 pm    Post subject: Tab separated fields

mecej4
To avoid scrambled text with the program, as you commented, the links are below, but you'll have to change the file location to suit you. I've also added a space in the 2nd title to see what would happen and it was ok. The code works well but I've got to see where empty titles are coming from and get rid of the spaces before each title when it prints.
This routine will be invaluable, certainly to me (and others) as I export lots of data from Excel to txt files before running Fortran. Putting a keyboard tab in the line I mentioned instead of using the tab variable seems to work.
Thanks again for your help.

https://www.pcmodfit.co.uk/adatac.txt

https://www.pcmodfit.co.uk/stringnew.f90
mecej4



Joined: 31 Oct 2006
Posts: 1884

Posted: Sat Jan 28, 2023 2:46 pm

Here are three reasons why your program stringnew.f90 failed:
    1. Your program does not search for and process tab characters (char(9)) at all.
    2. In your data file, the second field contains "title abc 2". Note the space character between "title" and "abc". The program assumes that the titles have no embedded spaces.
    3. There is a problem with the statement
    Code:
    j = index(numstr(jj:),'   ')

    The file has an actual tab character within the single quotes, and that will not work. In fact, the program editor, compiler and RTL will replace that literal tab character with something else or cause the index() function to return 0.

The following program should work correctly after the data file is corrected as to Item 2. above. It does work with other compilers, but with FTN95 it stops after outputting one line. I am going to troubleshoot that a little while later.
Code:
program xyz
   implicit none
   integer i, j, jj, k
   integer, parameter :: NT = 50
   character*200 numstr
   character*20 title(NT)
   character*22 cell
   integer num(NT)
   character*1 :: tab=char(9)
   open(11,file='adatac.txt',status='old')
   read(11,'(A)')numstr
   close(11)
   jj = 1
   do i = 1, NT
      j = index(numstr(jj:),tab)
      if(j == 0)then
         cell = trim(numstr(jj:))
      else
         if (j > 21) stop 'cell size > allowed 20'
         cell = numstr(jj:jj+j-2)
      endif
      k = index(cell,' ')
      title(i) = cell(1:k-1)
      read(cell(k+1:k+1),'(I1)')num(i)
      print 10,i,title(i),num(i)
   10 format(i2,' |',A,'| ',i5)
      if(j == 0)exit
      jj=jj+j
   enddo
end
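
Regarding Item 2, one possible way to tolerate embedded spaces in a title, rather than correcting the data file, is to split each cell at its last space instead of its first. This is only a sketch of that idea, not part of the program above: it assumes every cell ends with a space followed by a number, and the fallback value of 0 is an arbitrary choice made for illustration.

```fortran
program split_at_last_space
   implicit none
   character(len=22) :: cell
   character(len=20) :: title
   integer :: k, num, ios

   cell = 'title abc 2'                        ! cell with an embedded space
   k = index(trim(cell), ' ', back=.true.)     ! position of the LAST space
   if (k == 0) then
      title = cell                             ! no space at all: title only
      num = 0                                  ! arbitrary fallback (assumption)
   else
      title = cell(1:k-1)                      ! everything before the last space
      read (cell(k+1:), *, iostat=ios) num     ! list-directed read of the number
      if (ios /= 0) num = 0                    ! not a number: same fallback
   end if
   print '(a,i0)', '|' // trim(title) // '| ', num
end program split_at_last_space
```

For the sample cell this prints `|title abc| 2`. The TRIM matters: without it, BACK=.true. would find one of the trailing padding blanks of the fixed-length variable.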
Notquitenewton



Joined: 25 May 2021
Posts: 20
Location: England, UK

Posted: Sat Jan 28, 2023 3:45 pm    Post subject: Tab separated fields

It seems that FTN95 does not recognize CHAR(9) as a character if searched for in a string.
When I run the debugger it shows tab (defined as CHAR(9)) as a ?.

This is the problem methinks.

I'm using v8.95.0

Notquitenewton
mecej4



Joined: 31 Oct 2006
Posts: 1884

Posted: Sat Jan 28, 2023 3:53 pm    Post subject: Bug in FTN95, inputting text containing tab characters

The bug is not in INDEX(). It is in input conversion under format A. The tab characters are lost when a READ with A format is done. The character variable does not have any tab characters in it, and it is no surprise that INDEX cannot find tabs that are not present.

I have written up a bug report on the expansion of tab characters to a number of spaces ("Formatted read with A format converts tabs to spaces"):

http://forums.silverfrost.com/viewtopic.php?p=33764
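
A quick way to see whether tabs survive an A-format READ on a given compiler is to write a line containing a known number of tabs and count them after reading it back. The sketch below uses only standard Fortran; the unit number and the file name tabtest.txt are hypothetical, and it assumes the WRITE itself leaves the tabs untouched.

```fortran
program count_tabs_after_read
   implicit none
   character(len=1), parameter :: tab = achar(9)
   character(len=200) :: line
   integer :: i, ntabs
   integer, parameter :: u = 21                ! hypothetical unit number

   ! Write a one-line test file (hypothetical name) containing two tabs
   open (u, file='tabtest.txt', status='replace', action='write')
   write (u, '(a)') 'title a 1' // tab // 'title b 2' // tab // 'title c 3'
   close (u)

   ! Read the line back with A format, as in the programs in this thread
   open (u, file='tabtest.txt', status='old', action='read')
   read (u, '(a)') line
   close (u)

   ! Count the tab characters that reached the character variable
   ntabs = 0
   do i = 1, len_trim(line)
      if (line(i:i) == tab) ntabs = ntabs + 1
   end do
   print '(a,i0)', 'tabs found after READ with A format: ', ntabs
end program count_tabs_after_read
```

A count of 2 means the tabs came through intact; a count of 0 would match the behaviour described in this post.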
JohnCampbell



Joined: 16 Feb 2006
Posts: 2551
Location: Sydney

Posted: Sun Jan 29, 2023 7:42 am

The following code copes with both cases: horizontal tabs (char(9)) are present, or they have been replaced by at least 3 spaces.
In FTN95, "call READ_TABS@ (8)" will stop tabs being replaced by a variable number of spaces, which I think could be 0 to 8, rather than a minimum of 3.
It is much simpler not to have tabs replaced, so that you do not have to cope with tabs that were replaced by 0 to 2 spaces.
Code:
   implicit none
   integer i, k, jj, nc, jh, js
   character numstr*1000
   character title(50)*20
   character*1 :: tab = char(9)
   character :: file_name*50

   file_name = 'c:\pcmodfit v7.7\Results\adatac.txt'
   file_name = 'adataj.txt'                ! adjusted file terminated with [CR][LF]

   OPEN (UNIT=8, FILE=file_name, STATUS='UNKNOWN')
!   call READ_TABS@ (8)                     ! activate to read tabs
   read (8,'(A)') numstr
   nc = len_trim (numstr)
   numstr(nc+1:nc+1) = tab

   jj = 1
   i  = 0
   do k = 1, nc
      if ( jj > nc ) exit

      if ( numstr(jj:jj) == ' ') then      ! left justify next field
        write (*,11) jj,-1,0, numstr(jj:jj)
        jj = jj+1
        cycle
      end if

      jh = index (numstr(jj:),tab)    ; if ( jh <= 0 ) jh = len(numstr)+2-jj
      js = index (numstr(jj:),'   ')  ; if ( js <= 0 ) js = nc+2-jj

      if (jh < js) then                    ! select field terminated with TAB
         i = i+1
         title(i) = numstr(jj:jj+jh-2)
         write (*,11) jj,jh,i, trim (title(i)),'[HT]'
         if (jh > 21) write (*,*) 'cell size > allowed 20'
         jj = jj+jh

      else                                 ! select field terminated with 3 spaces
         i = i+1
         title(i) = numstr(jj:jj+js-1)
         write (*,11) jj,js,i, trim (title(i))
         if (js > 21) write (*,*) 'cell size > allowed 20'
         jj = jj+js
      end if

   end do
 11 format ( i4,i3,i2,': ',a,a)

   close (unit=8)  ! ,status='keep')
   end
Notquitenewton



Joined: 25 May 2021
Posts: 20
Location: England, UK

Posted: Sun Jan 29, 2023 10:36 am    Post subject: Tab separated fields

JohnCampbell
This does work and it even lets the user add spaces in each subtitle...something I've been struggling with for some time now!
Thank you very much for this code...it will certainly make my life easier for handling subtitles when getting Excel to export cell contents into txt files with Tab separators.
Looking more closely at your code, it seems the READ_TABS@ (8) command is the pivotal one, which I didn't know. I often read parts of the manual now and then and obviously missed it!
I just tried this out in my suite of programs and it works like a dream.
Thanks again JohnCampbell.
Notquitenewton
DanRRight



Joined: 10 Mar 2008
Posts: 2813
Location: South Pole, Antarctica

Posted: Mon Jan 30, 2023 10:19 am

Two posts about tabs in two days... Statistically this should happen once in ~1000 years :)

This is what I keep saying. This is what happens when there are too few users, and all but a few of them are procrastinators waiting for others to report problems and bugs and to make suggestions for improvements. It took 30 years for users to ask about the most basic thing in Fortran: what a Tab is, how long it is, how to treat it correctly, etc. All these decades they were just making their own workarounds or not using Tabs at all.

The company has to be more active in involving users in the process of making a better product. Advertise. Make good flashy examples, publish reviews of new features. More users means more features and more profit.

Besides Tabs in data files, problems also exist with Tabs in Fortran source code and apps. By the way, I was surprised to find that while I have been using Tabs in Fortran sources all the time since FTN77/FTN90, all other users and all other Fortran compilers fear Tabs like the devil, even in Fortran source code. Gfortran and Intel, for example, warn if you use Tabs!

Also, I recently switched editors from Notetab Pro to Kate and was surprised that it handles Tabs in free and fixed form sources absolutely flawlessly, including in its coloring schemes. Everything just works automatically, without the user having to correct anything. This is how Plato and %eb should behave too. Again, if there were more users, somebody would have suggested this long ago.
JohnCampbell



Joined: 16 Feb 2006
Posts: 2551
Location: Sydney

Posted: Thu Feb 02, 2023 3:23 am

Field delimiters have long been a problem for portability in Fortran. The 3 most common (from my usage) are :
* comma ","
* (horizontal) TAB
* semicolon ";"

Comma is by far the most used delimiter, being used in .csv files, but has deficiencies with "business" notation and also the conflict between English and European number formats. Comma is a visible character so is good for text files and has been largely adopted via English EXCEL usage.
(What is the history of Europe replacing "." with "," !! I have received important numeric data, such as 1,250 from a European source to later be informed it meant 1.25 and the trailing 0 was not necessary. 1,25 would have been easier to question. They knew what numeric format I would be expecting!).

Horizontal Tab has been used for a long time, but suffers from not having a visible representation. In many editors it can be replaced by a "random" number of white spaces, which can change the text file format when being edited. The tab character is not included in the Fortran standard character set; it is not a valid character, which makes its use so problematic. I first learnt to code on a 026 card punch, but never bothered to learn how to program the tabs for columns 7 and 73.

Semicolon is a great option, as it is a visible text character and does not conflict with number formats. Unfortunately it is not commonly used. Why haven't we had .ssv files more commonly used?
All the problems with hidden tabs or enclosing numbers in quotes ie "1,250" could have gone away !
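
Whichever delimiter is chosen, the splitting logic is the same, so it can be written once with the delimiter as a named constant. The following is a minimal sketch in standard Fortran along the lines of the programs earlier in this thread; the sample record, the array sizes and the program name are assumptions made for illustration.

```fortran
program split_fields
   implicit none
   integer, parameter :: max_fields = 50
   character(len=1), parameter :: delim = ';'   ! could equally be achar(9) or ','
   character(len=200) :: line
   character(len=20)  :: field(max_fields)
   integer :: nf, start, j, last

   line = 'title a 1;title b 2;title c 3'       ! hypothetical sample record
   last = len_trim(line)

   nf = 0
   start = 1
   do while (start <= last .and. nf < max_fields)
      j = index(line(start:last), delim)
      nf = nf + 1
      if (j == 0) then                          ! no further delimiter: last field
         field(nf) = line(start:last)
         exit
      end if
      field(nf) = line(start:start+j-2)         ! text before the delimiter
      start = start + j                         ! continue after the delimiter
   end do

   do j = 1, nf
      print '(i2,a)', j, ' |' // trim(field(j)) // '|'
   end do
end program split_fields
```

Spaces inside a field are kept, since only the delimiter ends a field. Fields longer than 20 characters are silently truncated by the assignment, so a length check like the one in the earlier programs would be needed for real data.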
LitusSaxonicum



Joined: 23 Aug 2005
Posts: 2388
Location: Yateley, Hants, UK

Posted: Thu Feb 02, 2023 11:14 am

John,

A really helpful exposition. I normally use CSV when exporting from Excel and didn't know (until I looked after reading your post) that exporting with tabs as a delimiter was possible. (I can't find SSV).

I used to use tabs to get to column 7 a lot. Why did that fall foul of the thought police?

Eddie
mecej4



Joined: 31 Oct 2006
Posts: 1884

Posted: Thu Feb 02, 2023 1:14 pm

Semicolons as field separators have an advantage over the other characters (comma, tab) that they are rarely used within text field data. Tabs are not visible. Commas, single quotes and hyphens may occur in addresses and names of people, places, etc.