forums.silverfrost.com Forum Index forums.silverfrost.com
Welcome to the Silverfrost forums

Which FORMAT to READ better ?

DanRRight



Joined: 10 Mar 2008
Posts: 2813
Location: South Pole, Antarctica

Posted: Sat Nov 29, 2014 4:30 pm    Post subject: Which FORMAT to READ better ?

Formatted input/output in Fortran has a great many options, and it is the trickiest part of the language. You end up analysing the data, assessing its type, how the values are separated, the number of elements... a whole lot of things. In the amount of sweat and swearing it is equalled only by our beloved CWP.

What would be the best format for reading an array of data of arbitrary length, with the individual numbers separated by spaces, like this:

1.59E-20 1.59E-20 1.59E-20 1.58E-20 1.58E-20 .....

The well-known, smart and simple * format, for example

READ(100,*) ARRAY

could be very convenient here if the length of ARRAY (say ARRAY(20)) and of the data are known (the data is longer than 20 values in this case). It takes care of any separators between the numbers: commas, spaces or tabs. But it automatically goes on to read the next line if the line holds fewer numbers than ARRAY is declared to hold, which can be either good or bad.

Is there a format similar to * that does not jump to the next line but reads exactly one single line?
mecej4



Joined: 31 Oct 2006
Posts: 1885

Posted: Sat Nov 29, 2014 8:10 pm

If all the fields in a line are of the same width w, even for any negative numbers present in the line, and there is always a single blank between fields, for n fields the length of the line with trailing blanks removed is L = n.w + (n-1).

What you can do, therefore, is to read the line into a sufficiently long character string, find the length of the trimmed line, and calculate n = (L+1)/(w+1). You can then read the values into the variables using a sequence of internal reads starting from positions 1, w+2, 2w+3, etc. (field k starts at position (k-1)(w+1)+1, because each field is followed by one blank).
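The arithmetic above can be sketched in Fortran roughly as follows. This is only a minimal illustration, assuming the field width w is known in advance and the line has already been read into a character variable with an '(A)' format. The module and routine names (fixed_width_reader, read_fixed_fields) are made up for the example and are not from any library.

```fortran
module fixed_width_reader
  implicit none
contains
  ! Read equal-width fields, separated by single blanks, from one line.
  ! line   : the record, already read with an '(A)' format
  ! w      : width of every field in characters (assumed known, w >= 1)
  ! values : receives the numbers; n is the number of fields read
  ! ios    : 0 on success, otherwise the IOSTAT of the failing field
  subroutine read_fixed_fields(line, w, values, n, ios)
    character(len=*), intent(in)  :: line
    integer,          intent(in)  :: w
    real,             intent(out) :: values(:)
    integer,          intent(out) :: n, ios
    integer :: k, first, length

    values = 0.0
    ios = 0
    length = len_trim(line)
    ! L = n*w + (n-1)  =>  n = (L+1)/(w+1); never exceed the array size.
    n = min((length + 1) / (w + 1), size(values))
    do k = 1, n
      ! Field k starts after k-1 fields and k-1 single blanks.
      first = (k - 1) * (w + 1) + 1
      read (line(first:first + w - 1), *, iostat=ios) values(k)
      if (ios /= 0) then
        n = k - 1          ! keep only the fields read successfully
        return
      end if
    end do
  end subroutine read_fixed_fields
end module fixed_width_reader
```

For the sample data in the first post each field is 8 characters wide, so the call would be read_fixed_fields(line, 8, array, n, ios), and n then reports how many numbers were actually on that line.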

If the fields are of variable width, you can scan and locate the blanks (or other field separator characters) in the string, and read from the next non-blank character.
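For variable-width fields, the scan-and-read idea might look like the sketch below. It assumes the separators are blanks or tabs only and that every field is a real number. The names line_scanner and read_blank_separated are hypothetical, chosen just for this example. The standard intrinsics VERIFY and SCAN do the work of finding where each field starts and ends.

```fortran
module line_scanner
  implicit none
contains
  ! Split one line into numbers separated by runs of blanks or tabs.
  ! n returns how many numbers were read; ios is nonzero if a field
  ! could not be converted (the fields before it are still returned).
  subroutine read_blank_separated(line, values, n, ios)
    character(len=*), intent(in)  :: line
    real,             intent(out) :: values(:)
    integer,          intent(out) :: n, ios
    character(len=2), parameter :: seps = ' ' // achar(9)  ! blank, tab
    integer :: pos, first, last, length

    values = 0.0
    n = 0
    ios = 0
    length = len_trim(line)
    pos = 1
    do while (pos <= length .and. n < size(values))
      ! Skip separators to find the first character of the next field.
      first = verify(line(pos:length), seps)
      if (first == 0) exit              ! only separators remain
      first = pos + first - 1
      ! The field ends just before the next separator, or at end of line.
      last = scan(line(first:length), seps)
      if (last == 0) then
        last = length
      else
        last = first + last - 2
      end if
      read (line(first:last), *, iostat=ios) values(n + 1)
      if (ios /= 0) return
      n = n + 1
      pos = last + 1
    end do
  end subroutine read_blank_separated
end module line_scanner
```

Because each internal read sees only one field, the read can never run on into the next record of the file.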
DanRRight



Joined: 10 Mar 2008
Posts: 2813
Location: South Pole, Antarctica

Posted: Sat Nov 29, 2014 10:59 pm

That's approximately how I usually do it. I have done it so many times that something broke in me today. I do not want to program that any more.

I hope someone will uncover some hidden options of the star * format or offer something different and simple.

Fortran has to move towards simplicity and offer some other, friendly and easy ways to do this kind of thing semi-automatically, without much programming, ideally with just a few clicks (visually/graphically). Otherwise each new user with his own output files makes me program a way to load his data, count spaces, start and end positions and so on, and waste tons of time. Then the user changes something a little and the whole thing has to be painfully adjusted again.

Maybe someone is willing to create a Clearwin-based Visual Format Designer? You drop a structured file into it and it shows you, in visual fields, how it will read the file, together with the Fortran text to insert into your code. If you do not like something, you just correct it with the mouse. This would be a useful tool; everybody needs it.
JohnCampbell



Joined: 16 Feb 2006
Posts: 2554
Location: Sydney

Posted: Sun Nov 30, 2014 4:55 am

Dan,

I have just recently found a related problem using gfortran with .csv format.
If I have a .csv line such as: 1.23, 4.77772, 1, 3.45, 1000000.00, 3.,
then Read (lu,*) a(1:6) will read it, IF I know how many numbers there are in the line.
FTN95 will read this with READ (lu, fmt='(bn,10f15.0)') a(1:8)
but gfortran will not! The use of , separators with 10f15.0 is apparently an extension.
I think that read (lu,*) accepts both space and , separators. (I could be wrong here also?)
The problem I have found with (lu,*) is that if too few numbers are provided, you either get an end-of-record error or the READ uses the next line to complete the read list.
My solution is to read the line of data into a character string and parse it into numbers, each terminated by either a space or a ",". I then read each single value and store it in an array, counting the number of values obtained.
Often it is useful to differentiate between ", 0," and ", ," as meaning zero and a "not defined" value respectively.
I have been trying to write a good free format reader for many years and have never found the perfect solution.
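A minimal sketch of the comma case, including the distinction between ", 0," and ", ,", could look like this. It is not the reader described above, only an illustration under simple assumptions: fields are separated by commas only, each non-empty field holds one real number, and a trailing comma at the end of the line does not start a further field. The names csv_fields, read_csv_line and the defined(:) flag array are invented for the example.

```fortran
module csv_fields
  implicit none
contains
  ! Split one comma-separated line into numbers.
  ! defined(k) is .false. where field k was empty (", ,"), so an
  ! explicit 0 can be told apart from a value that was not given.
  subroutine read_csv_line(line, values, defined, n, ios)
    character(len=*), intent(in)  :: line
    real,             intent(out) :: values(:)
    logical,          intent(out) :: defined(:)
    integer,          intent(out) :: n, ios
    integer :: first, comma, last, length

    values = 0.0
    defined = .false.
    n = 0
    ios = 0
    length = len_trim(line)
    first = 1
    do while (first <= length .and. n < size(values))
      ! Locate the comma that closes this field, if there is one.
      comma = index(line(first:length), ',')
      if (comma == 0) then
        last = length
      else
        last = first + comma - 2
      end if
      n = n + 1
      ! A zero-length or all-blank field is counted but left undefined.
      if (len_trim(line(first:last)) > 0) then
        read (line(first:last), *, iostat=ios) values(n)
        if (ios /= 0) then
          n = n - 1
          return
        end if
        defined(n) = .true.
      end if
      first = last + 2          ! step over the comma
    end do
  end subroutine read_csv_line
end module csv_fields
```

With the example line above this returns n = 6 with all six values defined, while "1.5, 0, , 2" returns n = 4 with the third field flagged as not defined.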

John
DanRRight



Joined: 10 Mar 2008
Posts: 2813
Location: South Pole, Antarctica

Posted: Sun Nov 30, 2014 4:36 pm

Yes, it is not easy to make anything universal. The star * format was probably one such attempt, and it almost got there, except for its extremely error-prone habit of moving on to another line, which is sort of logical but in practice is a design defect. A **, #, ^ or $ format should then be introduced that does not do this, and it would substantially simplify our formatted reading. When reading reaches the end of the line, all remaining variables in the list should simply be left unread (and reset to their initial state if the read picked up some trash) and an error raised. The programmer would then handle the error accordingly.

I found that the star format *on output* works a bit differently in other compilers. Their output looks a bit more pleasant and natural, I'd say. I will summarize that when I have time.
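Something close to this behaviour can be approximated with standard Fortran: read the record into a character string with an '(A)' format, then apply the * format to that string as an internal file. A scalar string is a single record, so the read cannot move on to the next line of the file and instead returns a nonzero IOSTAT when the values run out. The sketch below is one possible arrangement, not an established idiom. The names one_line_list_read and list_read_line are hypothetical, and the retry loop re-reads the string once per value, which is simple but quadratic in the number of values.

```fortran
module one_line_list_read
  implicit none
contains
  ! List-directed read confined to a single line held in a string.
  ! n returns how many leading values could be read; the remaining
  ! elements of values are set to 0.0.
  subroutine list_read_line(line, values, n)
    character(len=*), intent(in)  :: line
    real,             intent(out) :: values(:)
    integer,          intent(out) :: n
    real    :: trial(size(values))
    integer :: k, ios

    values = 0.0
    n = 0
    do k = 1, size(values)
      ! Ask for k values; the internal file ends when the line runs out,
      ! and a non-numeric token raises an error. Either way ios /= 0.
      read (line, *, iostat=ios) trial(1:k)
      if (ios /= 0) exit
      n = k
      values(1:k) = trial(1:k)
    end do
  end subroutine list_read_line
end module one_line_list_read
```

The usual list-directed rules still apply inside the string, so a "/" in the data ends the read early without an error, and repeat counts such as 3*1.0 are expanded. Data that may contain those would need one of the explicit scanning approaches instead.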
LitusSaxonicum



Joined: 23 Aug 2005
Posts: 2388
Location: Yateley, Hants, UK

Posted: Mon Dec 01, 2014 12:06 am

Dan,

Perhaps this forum isn’t the right place to ask for mods to standard fortran.

GENERAL RAMBLE

Subject to correction by someone who knows better, * formatting was invented for Fortran 77. I used a Fortran 66 (for ICL 1900/2900 series computers) that had a format 0 which was more or less the same thing. It also had formats such as I0 and F0.0. Nothing better has been introduced since, as far as I can see, except maybe the wider adoption of list-directed i/o, although I personally never use it as it doesn’t fit with the way I do things.

Some chums of mine once programmed a complete data description language which allowed for simple formulae as well as values (I’m sure it wasn’t unique) but you definitely needed to know in the program how many variables and their types you were expecting at any instant.
There shouldn’t be any problem in matching output from one fortran program with input to another as then you know what format it was written in, but Dan’s problem is that he doesn’t know how many data items he has in the file (or each line of the file), how they are delimited (or is this someone else’s insertion) nor what type they are. “You end up analysing the data” he says, and of course, there is no alternative, if you have no control over how the data stream is written in the first place. The problem is to parse the data stream, and that requires, I suspect, all the arts of the compiler writer.

If you have control over how that data set is written in the first place, you have more chance of being able to read it. If, for example, you wanted to read a list of numbers it would be simplest if they were output one-per-line, with a line containing some character(s) that might trigger an ERR= in the read statement. You could also hint at what type the data item was, maybe by putting something like the letter I in column 1 for an integer.
List directed input also assumes that you have control over the creation of the data set in the first place.
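The one-per-line idea with a terminating line might be sketched as follows. The comments above mention ERR=; this sketch uses IOSTAT= instead, which serves the same purpose without a branch label. It assumes the terminating line holds text that cannot be read as a number (for example END); the names sentinel_reader and read_until_sentinel are made up for illustration.

```fortran
module sentinel_reader
  implicit none
contains
  ! Read one number per line from an already opened unit until a line
  ! that is not a number (the sentinel) or the end of the file.
  subroutine read_until_sentinel(unit, values, n)
    integer, intent(in)  :: unit
    real,    intent(out) :: values(:)
    integer, intent(out) :: n
    real    :: x
    integer :: ios

    values = 0.0
    n = 0
    do while (n < size(values))
      read (unit, *, iostat=ios) x
      ! ios > 0: the line was not a number (sentinel reached);
      ! ios < 0: end of file. Both stop the loop.
      if (ios /= 0) exit
      n = n + 1
      values(n) = x
    end do
  end subroutine read_until_sentinel
end module sentinel_reader
```

The sentinel has to be chosen with some care: a processor may accept text such as NaN or Inf as a valid real, so a word like END is a safer choice.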

POSSIBLE SOLUTION?

If one has to “end up analysing the data”, then the FTN77 library routine READFA@ seems useful, as it finds the length of a line in characters as well as reading it into a CHARACTER*(*) variable to analyse, and that removes one of the problems in “analysing it”. You might need to open the file with OPENR@ (or OPENV@ if you understand the description of what it does!).

It occurs to me that if you used this string and the separators conformed to the rules for command lines, you could create a pseudo command line with SET_COMMAND_LINE@ and then use the FTN77 library command line processing functions to sort out the values. If the separators didn’t conform, you could always replace all the invalid characters with <space> in a preliminary pass. CMNARGS@ for instance will tell you how many arguments there are. There are also routines to get the ‘tokens’ off the command line in various ways.

However, you’ll never be able to determine if (say) 12345 is the integer value, or 12345.0

Eddie
DanRRight



Joined: 10 Mar 2008
Posts: 2813
Location: South Pole, Antarctica

Posted: Mon Dec 01, 2014 2:26 am

Good points, Eddie. I will look again at READFA@, OPENR@ etc. At some point two decades ago I found them buggy, in addition to being non-standard, and never went back. I need to simplify and shorten the parsing code, making it far more readable.

But the ideal solution, as I imagine it, may be completely different. It has to be a higher-level visual tool for analysing data files that interacts with the programmer. It has to show a giant Excel-like table with the fields it found readable and the parsed, structured, repeated data fields, and at the end ask you whether a value is the integer 12345 or the real 12345.0. If you say "I don't care" it will take the default (integer).

In my case I need to load and analyse data files, in a dozen formats, generated by third-party Fortran (and lately C) code. Users of course change the output far too often. Each time something changes, sometimes a single space, I get reading bugs and, swearing, start searching through pages and pages of my boring Fortran code that parses all that.

With my imagined supertool the problem would be solved in 0.5 seconds, with complete Fortran code generated to read this specific file.

Parsing, parsing, parsing... I wrote so many warnings for all kinds of potential reading errors that my Fortran code is now completely unreadable. Yes, when the warnings work they work like a charm, showing you the exact offending place in a popup window, highlighted in colour. But when they fail, I sometimes spend a lot of time catching the error in all that parsing abracadabra.
JohnCampbell



Joined: 16 Feb 2006
Posts: 2554
Location: Sydney

Posted: Mon Dec 01, 2014 8:24 am

Dan,

I'm all over the place on this topic.

I have written parsing routines to read a list of numbers, even included in the number definition simple maths (+ - / * ) and once ^ or **. Others have even introduced functions SIN, COS, SQRT, which are useful for converting between Cartesian and Polar coordinates.
Another feature which looks good, and which I have not (yet) tried, is to give each column a NAME and allow a selection of the default order of the numbers.

However, I think the reason I have never persisted with these approaches is that I typically generate a lot of data in Excel and then export the data as either .csv or more often fixed format .prn format, both of which are very easy to read with FTN95.

I think that the need for flexible data formats has been reduced by Excel, with the next frontier being to read directly from the Excel sheet, which again is often too difficult to justify. I keep the master data set in a .xls file and the interface copy in .prn or .csv. The problem with a .csv master is that it does not support formulae, so .xls is still required.

So Dan, the answer to this problem varies depending on how much you can control the data and where it comes from. For me most data is provided in an Excel format so the need for a good free format solution is a thing of the past.

John
jalih



Joined: 30 Jul 2012
Posts: 196

Posted: Mon Dec 01, 2014 5:55 pm

JohnCampbell wrote:

I think that flexible data formats have been mitigated by Excel, with the next frontier to be reading directly from the excel sheet, which again is often too difficult to justify.

A long time ago I posted a simple FTN95-callable DLL with some OLE Automation support for Excel. It supported creating, deleting, saving, naming and selecting workbooks and worksheets. You could read cells and write into cells.

Did you try its functionality? I did not get any feedback, so I did not add more functionality.

It's available here.
All times are GMT + 1 Hour