Silverfrost Forums

Tab separated fields

27 Jan 2023 4:52 #29883

I have a row of text containing subtitles. Each subtitle contains a space and is separated from the next subtitle by a tab. I've roughly coded this (below) as an example, and it gets mixed up when the subtitle width, including the space, is about 8, which I think is the default tab width. There is probably a much better way of doing this but I can't see it. Example: title, space, 1, tab, titleabcd, space, 2, tab, next title, and so on. I'm copying a row from Excel into a text file, and it is tab separated with 1 or 2 spaces in each cell. A bit puzzled at present. A subtitle can be anywhere between 1 and 20 characters long, with a couple of spaces. Any help welcomed.

  IMPLICIT NONE
  INCLUDE <Windows.ins> , nolist
  INTEGER i, j, jj
  INTEGER (KIND=2) kk
  CHARACTER*1000 numstr
  CHARACTER*20 title(50)
   numstr = 'titleab 1	titleabc 2	titleabcd 3	titleabcde 4
 & 	titleabcdef 5	titleabcdefg 6	titleabcdefgh 7	
 &	titleabcdefghi 8	titleabcdefghij 9'
  CALL compress@(numstr,kk)
  CALL trim@(numstr)
  do i = 1, 50
   CALL trim@(numstr)
   jj = leng(numstr)
   j = index(numstr(1:jj),CHAR(9))
   if (j .ne. 0) then
    title(i) = numstr(1:j-1)
    print *, title(i)
    numstr = numstr(j+1:jj)
   endif
  enddo
  end
28 Jan 2023 12:09 #29884

One of the undesirable consequences of using fixed size strings is having to contend with a lot of padding.

Does the following code satisfy your needs?

program xyz
   implicit none
   integer i, j, jj
   character*1000 numstr
   character*20 title(50)
   character*1 :: tab=char(9)
   numstr = 'titleab 1'       //tab//'titleabc 2'      //tab//'titleabcd 3'   //tab// &
            &'titleabcde 4'   //tab//'titleabcdef 5'   //tab//'titleabcdefg 6'//tab// &
            &'titleabcdefgh 7'//tab//'titleabcdefghi 8'//tab//'titleabcdefghij 9'
   jj = 1
   do i = 1, 50
      j = index(numstr(jj:),tab)
      if(j == 0)then
         title(i) = trim(numstr(jj:))
      else
         if (j > 21) stop 'cell size > allowed 20'
         title(i) = numstr(jj:jj+j-2)
      endif
      print *, '|'//trim(title(i))//'|'
      if(j == 0)exit
      jj=jj+j
   enddo
end

If you wish to split off the title string from the number, you can add a couple of lines of code using INDEX(title(i),' '), as follows:

program xyz
   implicit none
   integer i, j, jj, k
   integer, parameter :: NT = 50
   character*1000 numstr
   character*20 title(NT)
   character*22 cell
   integer num(NT)
   character*1 :: tab=char(9)
   numstr = 'titleab 1'       //tab//'titleabc 2'      //tab//'titleabcd 3'   //tab// &
            &'titleabcde 4'   //tab//'titleabcdef 5'   //tab//'titleabcdefg 6'//tab// &
            &'titleabcdefgh 7'//tab//'titleabcdefghi 8'//tab//'titleabcdefghij 9'
   jj = 1
   do i = 1, NT
      j = index(numstr(jj:),tab)
      if(j == 0)then
         cell = trim(numstr(jj:))
      else
         if (j > 21) stop 'cell size > allowed 20'
         cell = numstr(jj:jj+j-2)
      endif
      k = index(cell,' ')
      title(i) = cell(1:k-1)
      read(cell(k+1:),*)num(i)   ! list-directed read; '(I)' without a width is a nonstandard extension
      print 10,i,title(i),num(i)
   10 format(i2,' |',A,'| ',i5)
      if(j == 0)exit
      jj=jj+j
   enddo
end
28 Jan 2023 10:25 #29887

To mecej4, Many thanks for your reply and for taking the time. The one you wrote works perfectly. The problem I've encountered, it's not FTN95 or you but probably me, is when I try to read the string from a text file which would be the case, it gets confused...so do I. Latest code (location of txt file is in directory xxxxx).

   implicit none
   integer i, j, jj
   character*1000 numstr
   character*20 title(50)
   character*1 :: tab=char(9)
   OPEN (UNIT=8,FILE='xxxxx\adatac.txt',STATUS='UNKNOWN')
   read(8,'(A)') numstr
   jj = 1
   do i = 1, 50
      j = index(numstr(jj:),tab)
      if(j == 0)then
         title(i) = trim(numstr(jj:))
      else
         if (j > 21) stop 'cell size > allowed 20'
         title(i) = numstr(jj:jj+j-2)
      endif
      print *, '|'//trim(title(i))//'|'
      if(j == 0)exit
      jj=jj+j
   enddo
   close(unit=8,status='keep')
   end

The txt file contains... titleab 1 titleabc 2 titleabcd 3 titleabcde 4 titleabcdef 5

Space between title and number and a tab between number and next title. I think the issue might be to do with tabs and spaces but I can't see it! Thanks again. Notquitenewton

28 Jan 2023 11:11 #29888

mecej4, A quick follow up: reading the string from a txt file didn't work, but if I replace the line j = index(numstr(jj:),tab) with j = index(numstr(jj:),' ') it shows signs of working, though I might have to tweak it a bit. When I used sdbg (the debugger) to see what the code was doing, the tab variable just showed a question mark! I guess this is a function of Fortran. Notquitenewton

28 Jan 2023 11:15 #29889

Please provide an exact copy of the data file adatac.txt .

Program source and/or data in text form, when posted in line in a user post in this and similar forums, can get mangled -- for instance, the tab characters that you are processing are either invisible, or may get replaced with spaces, etc. When such mangling occurs, the number of possible bugs increases, and debugging gets more difficult.

Please upload the file(s) to a cloud service (Dropbox, Google Drive, etc.) or the www.pcmodfit.co.uk site that you used in a previous post to this forum.

28 Jan 2023 12:10 #29890

mecej4 To avoid scrambled text with the program, as you commented, the links are below, but you'll have to change the file location to suit you. I've also added a space in the 2nd title to see what would happen and it was ok. The code works well, but I've got to see where empty titles are coming from and get rid of the spaces before each title when it prints. This routine will be invaluable, certainly to me (and others), as I export lots of data from Excel to txt files before running Fortran. Putting a keyboard tab in the line I mentioned instead of using the tab variable seems to work. Thanks again for your help.

https://www.pcmodfit.co.uk/adatac.txt

https://www.pcmodfit.co.uk/stringnew.f90
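
On the two loose ends mentioned above (empty titles and leading spaces), here is a minimal sketch and not a tested fix for stringnew.f90. It assumes the fields have already been split into a title array; the sample values and the names tidy_fields and nkept are made up for illustration. Empty cells are skipped with LEN_TRIM, and ADJUSTL moves any leading blanks to the end of each kept field.

```fortran
program tidy_fields
   implicit none
   character(len=20) :: title(4)
   integer :: i, nkept

   ! made-up fields as a splitter might return them:
   ! some have leading blanks and one cell is empty
   title(1) = '  titleab 1'
   title(2) = 'title abc 2'
   title(3) = ' '
   title(4) = '   titleabcd 3'

   nkept = 0
   do i = 1, size(title)
      if (len_trim(title(i)) == 0) cycle      ! skip empty cells
      nkept = nkept + 1
      title(nkept) = adjustl(title(i))        ! move leading blanks to the end
      print *, '|'//trim(title(nkept))//'|'
   end do
   print *, 'fields kept: ', nkept
end program tidy_fields
```

Packing the kept fields back into the front of the same array keeps the indices contiguous, which may or may not be what you want if the column position of a title matters.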

28 Jan 2023 1:46 #29891

Here are three reasons why your program stringnew.f90 failed:

1. Your program does not search for and process tab characters (char(9)) at all.

2. In your data file, the second field contains 'title abc 2'. Note the space character between 'title' and 'abc'. The program assumes that the titles have no embedded spaces.

3. There is a problem with the statement

j = index(numstr(jj:),'	')

The file has an actual tab character within the single quotes, and that will not work. In fact, the program editor, compiler and RTL will replace that literal tab character with something else or cause the index() function to return 0. The following program should work correctly after the data file is corrected as described in item 2 above. It does work with other compilers, but with FTN95 it stops after outputting one line. I am going to troubleshoot that a little later.

program xyz
   implicit none
   integer i, j, jj, k
   integer, parameter :: NT = 50
   character*200 numstr
   character*20 title(NT)
   character*22 cell
   integer num(NT)
   character*1 :: tab=char(9)
   open(11,file='adatac.txt',status='old')
   read(11,'(A)')numstr
   close(11)
   jj = 1
   do i = 1, NT
      j = index(numstr(jj:),tab)
      if(j == 0)then
         cell = trim(numstr(jj:))
      else
         if (j > 21) stop 'cell size > allowed 20'
         cell = numstr(jj:jj+j-2)
      endif
      k = index(cell,' ')
      title(i) = cell(1:k-1)
      read(cell(k+1:k+1),'(I1)')num(i)
      print 10,i,title(i),num(i)
   10 format(i2,' |',A,'| ',i5)
      if(j == 0)exit
      jj=jj+j
   enddo
end
28 Jan 2023 2:45 #29892

It seems that FTN95 does not recognize CHAR(9) as a character if searched for in a string. When I run the debugger it shows tab (defined as CHAR(9)) as a ?.

This is the problem methinks.

I'm using v8.95.0

Notquitenewton

28 Jan 2023 2:53 #29894

The bug is not in INDEX(). It is in input conversion under format A. The tab characters are lost when a READ with A format is done. The character variable does not have any tab characters in it, and it is no surprise that INDEX cannot find tabs that are not present.

I have written up a bug report on the expansion of tab characters to a number of spaces ('Formatted read with A format converts tabs to spaces'):

  http://forums.silverfrost.com/viewtopic.php?p=33764
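
A quick way to check on your own system whether the tabs survive the READ is to count them in the character variable afterwards. This is only a diagnostic sketch in standard Fortran. It assumes the data file is adatac.txt in the current directory, as in the program above; the program name count_tabs and the buffer length of 200 are arbitrary choices.

```fortran
program count_tabs
   implicit none
   character(len=1), parameter :: tab = achar(9)
   character(len=200) :: line
   integer :: i, ntabs, ios

   ! adatac.txt is the data file name used earlier in this thread
   open (unit=11, file='adatac.txt', status='old', action='read', iostat=ios)
   if (ios /= 0) stop 'cannot open adatac.txt'
   read (11, '(A)', iostat=ios) line
   close (11)
   if (ios /= 0) stop 'cannot read the first line'

   ! count the tab characters that are actually present in the variable
   ntabs = 0
   do i = 1, len_trim(line)
      if (line(i:i) == tab) ntabs = ntabs + 1
   end do
   print '(A,I0)', 'tab characters found after READ: ', ntabs

   ! print the character codes so that spaces (32) and tabs (9) can be told apart
   print '(20I4)', (iachar(line(i:i)), i = 1, len_trim(line))
end program count_tabs
```

If the count is zero although the file is known to contain tabs, the tabs were lost during the READ and no later INDEX search for char(9) can succeed.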
29 Jan 2023 6:42 #29896

The following code copes whether horizontal tabs (char(9)) are present or have been replaced by at least 3 spaces. In FTN95, 'call READ_TABS@ (8)' will stop tabs being replaced by a variable number of spaces, which I think could be 0 to 8, rather than a minimum of 3. It is much simpler not to have tabs replaced, so that the code does not have to cope with tabs that were replaced by 0 to 2 spaces.

   implicit none
   integer i, k, jj, nc, jh, js
   character numstr*1000
   character title(50)*20
   character*1 :: tab = char(9)
   character :: file_name*50

   file_name = 'c:\pcmodfit v7.7\Results\adatac.txt'
   file_name = 'adataj.txt'                ! adjusted file terminated with [CR][LF]

   OPEN (UNIT=8, FILE=file_name, STATUS='UNKNOWN')
!   call READ_TABS@ (8)                     ! activate to read tabs
   read (8,'(A)') numstr
   nc = len_trim (numstr)
   numstr(nc+1:nc+1) = tab

   jj = 1
   i  = 0
   do k = 1, nc
      if ( jj > nc ) exit

      if ( numstr(jj:jj) == ' ') then      ! left justify next field
        write (*,11) jj,-1,0, numstr(jj:jj)
        jj = jj+1
        cycle
      end if

      jh = index (numstr(jj:),tab)    ; if ( jh <= 0 ) jh = len(numstr)+2-jj
      js = index (numstr(jj:),'   ')  ; if ( js <= 0 ) js = nc+2-jj

      if (jh < js) then                    ! select field terminated with TAB
         i = i+1
         title(i) = numstr(jj:jj+jh-2)
         write (*,11) jj,jh,i, trim (title(i)),'[HT]'
         if (jh > 21) write (*,*) 'cell size > allowed 20'
         jj = jj+jh

      else                                 ! select field terminated with 3 spaces
         i = i+1
         title(i) = numstr(jj:jj+js-1)
         write (*,11) jj,js,i, trim (title(i))
         if (js > 21) write (*,*) 'cell size > allowed 20'
         jj = jj+js
      end if

   end do
 11 format ( i4,i3,i2,': ',a,a)

   close (unit=8)  ! ,status='keep')
   end
29 Jan 2023 9:36 #29897

JohnCampbell This does work and it even lets the user add spaces in each subtitle...something I've been struggling with for some time now! Thank you very much for this code...it will certainly make my life easier for handling subtitles when getting Excel to export cell contents into txt files with Tab separators. Looking more closely at your code, it seems the READ_TABS@ (8) command is the pivotal one which I didn't know. I often read parts of the manual now and then and obviously missed it! I just tried this out in my suite of programs and it works like a dream. Thanks again JohnCampbell. Notquitenewton 😄

30 Jan 2023 9:19 #29900

Two posts about tabs in two days... Statistically this should happen once in ~1000 years 😃

This is what I have been saying all along. This is what happens when there are too few users, and all but a few of them are procrastinators waiting for others to report problems and bugs and to make suggestions for improvements. It took 30 years for users to ask about the most basic thing in Fortran: what a Tab is, how long it is, how to treat it correctly, etc. All these decades they were just making their own workarounds or not using Tab at all.

The company has to be more active in involving users in the process of making the product better. Advertise. Make good flashy examples, write reviews of new features. More users means more features and more profit.

Besides the use of Tabs in data files, problems also exist with Tabs in Fortran source code and apps. By the way, I was surprised to find that, while I have been using Tabs in Fortran sources all the time since FTN77/FTN90, all other users and all other Fortran compilers fear Tabs like the devil, even in Fortran source code. Gfortran and Intel, for example, warn if you use Tabs!

Also, I recently switched editor from Notetab Pro to Kate and was surprised that it handles Tabs in free and fixed form sources absolutely flawlessly, including in its coloring schemes. Everything just works automatically, without any intervention from the user to correct anything. This is how Plato and %eb should work too. Again, if there were more users, somebody would have suggested this long ago.

2 Feb 2023 2:23 #29911

Field delimiters have long been a problem for portability in Fortran. The 3 most common (from my usage) are :

  • comma ','
  • (horizontal) TAB
  • semicolon ';'

Comma is by far the most used delimiter, being used in .csv files, but it has deficiencies with 'business' notation and also conflicts between English and European number formats. Comma is a visible character, so it is good for text files, and it has been largely adopted via English EXCEL usage. (What is the history of Europe replacing '.' with ','? I have received important numeric data, such as 1,250, from a European source, to be informed later that it meant 1.25 and the trailing 0 was not necessary. 1,25 would have been easier to question. They knew what numeric format I would be expecting!)
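
For a field known to use a decimal comma, one simple workaround is to swap the comma for a point before converting. The sketch below is an illustration only: the field value '1,25' and the program name decimal_comma are made up, and it assumes the field holds a single number with no thousands separators. Fortran 2003 also defines a DECIMAL='COMMA' specifier for formatted I/O, but nothing in this thread establishes whether FTN95 supports it, so the sketch avoids it.

```fortran
program decimal_comma
   implicit none
   character(len=20) :: field
   real :: x
   integer :: k, ios

   field = '1,25'                   ! hypothetical field using a decimal comma
   k = index(field, ',')
   if (k > 0) field(k:k) = '.'      ! replace the first comma with a decimal point

   read (field, *, iostat=ios) x    ! internal list-directed read of the number
   if (ios /= 0) stop 'field is not a number'
   print '(F8.3)', x
end program decimal_comma
```

This cannot resolve the ambiguity described above: a field such as 1,250 is converted to 1.25 whether or not the sender meant one thousand two hundred and fifty.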

Horizontal Tab has been used for a long time, but suffers from not having a visible representation. In many editors it can be replaced by a 'random' number of white spaces, which can change the text file format when being edited. The tab character is not included in the Fortran standard character set; it is not a valid character, which makes its use so problematic. I first learnt to code on a 026 card punch, but never bothered to learn how to program the tabs for columns 7 and 73.

Semicolon is a great option, as it is a visible text character and does not conflict with number formats. Unfortunately it is not commonly used. Why haven't we had .ssv files more commonly used? All the problems with hidden tabs or enclosing numbers in quotes, i.e. '1,250', could have gone away!

2 Feb 2023 10:14 #29913

John,

A really helpful exposition. I normally use CSV when exporting from Excel and didn't know (until I looked after reading your post) that exporting with tabs as a delimiter was possible. (I can't find SSV).

I used to use tabs to get to column 7 a lot. Why did that fall foul of the thought police?

Eddie

2 Feb 2023 12:14 #29914

Semicolons as field separators have an advantage over the other characters (comma, tab) that they are rarely used within text field data. Tabs are not visible. Commas, single quotes and hyphens may occur in addresses and names of people, places, etc.
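
The same INDEX-based loop used earlier in this thread for tabs carries over to semicolons by changing only the delimiter. The following is a minimal sketch under stated assumptions: the sample line is made up, the names split_semicolon, delim and field are illustrative, and fields longer than 20 characters would be silently truncated.

```fortran
program split_semicolon
   implicit none
   integer, parameter :: nt = 50
   character(len=1), parameter :: delim = ';'
   character(len=200) :: line
   character(len=20)  :: field(nt)
   integer :: n, start, j

   line = 'title ab 1;title abc 2;;title abcd 3'   ! made-up sample data
   n = 0
   start = 1
   do
      if (n == nt) stop 'too many fields'
      j = index(line(start:), delim)        ! offset of the next delimiter, 0 if none
      n = n + 1
      if (j == 0) then
         field(n) = line(start:)            ! last field: the rest of the line
         exit
      end if
      field(n) = line(start:start+j-2)      ! text before the delimiter (may be empty)
      start = start + j                     ! continue just after the delimiter
   end do

   do j = 1, n
      print '(I2,A)', j, ' |'//trim(field(j))//'|'
   end do
end program split_semicolon
```

Because the delimiter is a visible character, embedded spaces in a title need no special handling, and two adjacent semicolons simply produce an empty field.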

2 Feb 2023 12:23 #29915

Quoted from LitusSaxonicum

I used to use tabs to get to column 7 a lot. Eddie

Tabs in data files do not have the same intended meaning as in Fortran fixed form source files. When a user creates a Fortran source file using a text editor, pressing the tab key may cause the cursor to move right to the next tab stop, but the tab characters are usually not saved to the Fortran source file, just to avoid trouble with the compiler.

FTN95 provides a /tabs option (appears to be selected, by default) to control how tabs are treated. Plato has an option (selected by default) to replace tabs by spaces.

My experience is that having tab characters in Fortran source files amounts to asking for trouble. Card punch machines and typewriters had control drums and tab bars for setting tab stops, but the output cards and paper sheets had only spaces.
