Generating an array from binary data

Hi,
I'm having a little trouble generating an array from a binary file.
The data is contained in the attached file.
I'm using the snippet of code below to read the data.
path = 'd_300x300-Oxs_TimeDriver-Magnetization-000000026-0000105.txt';
A = fopen(path);
fseek(A,987,'bof');
B = fread(A,[160,20],'8*uint8=>uint8',16,'b');
C = typecast(B(:,1),'double');
dlmwrite('output.txt',C);
The data gives the pixel value element by element for a 20x20-pixel array. So I'm trying to read the data directly into its pixel location, i.e. put the first 20 values into a column, then the next 20 values into the next column, and so on.
When I inspect output.txt as text, the data comes out correctly in column format. The first data point is indeed 8.3524e5. However, when I scan across the row, the readout is incorrect (the first value should again be 8.3524e5).
I believe the error arises because MATLAB is not recognizing that one element comprises 8 bytes. It treats each byte as its own element instead of reading 8 bytes == 1 element, hence the required [160,20] output size (160 bytes per column of 20 values).
Does anyone know how to fix this?
Thanks very much in advance,
Carl
  3 Comments
Carl on 29 Jun 2014
Hi Geoff,
Thank you for responding!
Your solution
B = fread(fid,[20,20],'1*double',16);
works perfectly :-) Yes the skipping is intentional: physically, the data corresponds to the x/y/z components of magnetization of a material, and the skipping is needed to look at just one component. Also the symmetry of the matrix is expected, as the system is simple and highly symmetric (I'm looking at a much simplified model, in order to build up the code first).
I have one other question. Is it possible to keep the data in binary format within MATLAB's workspace? Using '1*double' definitely works, but I know this code will soon have to handle extremely large matrices, so I'm keen to keep the data in binary format for as long as possible to make the code as fast as possible.
Thanks once again,
Carl
Geoff Hayes on 29 Jun 2014
Edited: Geoff Hayes on 29 Jun 2014
Carl - as for keeping the data in a binary format within MATLAB's workspace, I'm not sure how that can be done and whether that would be better or worse (?) than just reading from file and creating the matrix.
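As a side note on the "binary format" question, a double array in the workspace is already held as raw 8-byte values, so it is binary in that sense. The following is only a minimal sketch with made-up variable names (B and rawBytes are stand-ins, not the data from the attached file) comparing the memory footprint of a 20x20 double matrix with the same data viewed as raw bytes:

```matlab
% Illustrative only: compare the memory used by 400 doubles with the
% memory used by the same data viewed as raw bytes.
B = zeros(20, 20);                    % stand-in for the 20x20 matrix from fread
rawBytes = typecast(B(:), 'uint8');   % same data reinterpreted as 3200 bytes

infoB   = whos('B');
infoRaw = whos('rawBytes');
fprintf('double matrix: %d bytes, raw bytes: %d bytes\n', ...
        infoB.bytes, infoRaw.bytes);
```

Both variables should report the same size, so holding the raw uint8 values would not by itself save memory. If memory becomes the limiting factor, reading with the precision 'double=>single' is one option to consider, at the cost of precision.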
One concept that I'm not all that familiar with (but have glanced at!) is memmapfile. See http://www.mathworks.com/help/matlab/ref/memmapfile.html for details. From the link,
Memory-mapping is a mechanism that maps a portion of a file, or an entire file, on disk to a range of memory addresses within the MATLAB® address space. Then, MATLAB can access files on disk in the same way it accesses dynamic memory, accelerating file reading and writing. Memory-mapping allows you to work with data in a file as if it were a MATLAB array.
The overview of memory mapping (http://www.mathworks.com/help/matlab/import_export/overview-of-memory-mapping.html) discusses when and why you may want to use this (what the benefits are, etc.).
For your example, this is how you would set it up
filename = 'd_300x300-Oxs_TimeDriver-Magnetization-000000026-0000105.txt';
memMappedData = memmapfile(filename,...
'Format',{'double',[3 20 20],'xyzComps'},'Offset',987);
memMappedData is a memmapfile object with the following properties
memMappedData =
    Filename: '/Users/geoff/Development/bitbucket_repos/matlab/d_300x300-Oxs_TimeDriver-Magnetization-000000026-0000105.txt'
    Writable: false
    Offset: 987
    Format: {'double' [3 20 20] 'xyzComps'}
    Repeat: Inf
    Data: 1x1 struct array with fields:
        xyzComps
So it is very similar to what we had before. The offset of 987 bytes is equivalent to fseek(fid,987,'bof'), and we have access to all of the x/y/z component data in a 3x20x20 array.
We can access the data from the memMappedData object as
memMappedData.Data.xyzComps
To get at the first and last elements of component (x) we could do something like
memMappedData.Data.xyzComps(1,1,1) % first
memMappedData.Data.xyzComps(1,20,20) % last
To get the B that we had in the previous code, just do
B = squeeze(memMappedData.Data.xyzComps(1,:,:))
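To try the mapping without the attached file, here is a minimal sketch on synthetic data. The scratch file, the zero-filled 987-byte header, and the variable names (xyz, tmpFile, m, xyzAll, Bx, By) are all assumptions made for illustration; only the Format and Offset arguments mirror the call above.

```matlab
% Sketch with synthetic data; the file name and its contents are made up.
n = 20;
xyz = reshape(1:3*n*n, 3, n, n);               % xyz(c,i,j): component c at pixel (i,j)
tmpFile = [tempname '.bin'];                   % hypothetical scratch file
fid = fopen(tmpFile, 'w');
fwrite(fid, zeros(987, 1, 'uint8'), 'uint8');  % stand-in for the 987-byte header
fwrite(fid, xyz, 'double');                    % interleaved x/y/z doubles
fclose(fid);

% Map one 3-by-n-by-n block of doubles starting after the header.
m = memmapfile(tmpFile, 'Format', {'double', [3 n n], 'xyzComps'}, ...
               'Offset', 987, 'Repeat', 1);

xyzAll = m.Data.xyzComps;                      % copy the mapped block once
Bx = squeeze(xyzAll(1, :, :));                 % x component, n-by-n
By = squeeze(xyzAll(2, :, :));                 % y component, n-by-n

clear m                                        % release the mapping before deleting
delete(tmpFile);
```

Copying m.Data.xyzComps into a local variable once and indexing that copy is likely to be cheaper than indexing through the object repeatedly, although for very large files that copy may itself be too big to hold in memory.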
This is a neat alternative, but without much experience with it, I can't say for certain whether it is a better one.


Accepted Answer

Geoff Hayes on 29 Jun 2014
For every 8 bytes of data that the code reads in, 16 bytes are then skipped. As Carl has already answered, this is intentional due to the data corresponding to the x/y/z components of magnetization of a material, and the skipping is needed to look at just one component.
I ran your code and obtained the same results: the first element of the first column was 835237.585310508, and the first element of the second column was something else (837157.626801806).
I tried something different by replacing the
B = fread(A,[160,20],'8*uint8=>uint8',16,'b');
C = typecast(B(:,1),'double');
with
B = fread(fid,[20,20],'1*double',16);
For every double read, we skip two (16 bytes). This line of code returned the same results as reading in eight uint8 values at a time and using typecast.
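The equivalence of the two reads can be checked on synthetic data. This is only a sketch: the scratch file, its zero-filled 987-byte header, and the names xyz, tmpFile, Bx, B8 and C are assumptions for illustration, and the file is written in the machine's native byte order.

```matlab
% Build a small synthetic file: 987 header bytes, then x/y/z triples as doubles.
n = 20;
xyz = reshape(1:3*n*n, 3, n, n);               % xyz(c,i,j): component c at pixel (i,j)
tmpFile = [tempname '.bin'];                   % hypothetical scratch file
fid = fopen(tmpFile, 'w');
fwrite(fid, zeros(987, 1, 'uint8'), 'uint8');  % stand-in for the header
fwrite(fid, xyz, 'double');
fclose(fid);

fid = fopen(tmpFile, 'r');

% Route 1: read one double, then skip 16 bytes (the y and z values).
fseek(fid, 987, 'bof');
Bx = fread(fid, [n n], '1*double', 16);

% Route 2: read 8 raw bytes, skip 16, then reinterpret ALL the bytes as doubles.
fseek(fid, 987, 'bof');
B8 = fread(fid, [8*n n], '8*uint8=>uint8', 16);
C  = reshape(typecast(B8(:), 'double'), n, n);

fclose(fid);
delete(tmpFile);

fprintf('Routes agree: %d\n', double(isequal(Bx, C)));
```

Note that route 2 typecasts B8(:) rather than B8(:,1). Typecasting only the first column yields just the first 20 values, which may be why only the first column looked right in the original output.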
What I did notice about B is that the last column is identical to the first but in reverse order, the second-to-last column is identical to the second but in reverse order, and so on. In fact, every such pair of columns was near-identical when I ran the following
for k=1:10
    if max(abs(B(:,k) - flipud(B(:,20-k+1)))) < 1e-8
        fprintf('near-identical for %d!\n',k);
    end
end
Again, as Carl has already answered, this behaviour is expected as the system is simple and highly symmetric.
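The same comparison can be made for all columns at once, since "column k equals the flipped column 20-k+1" is the same as saying that B equals itself rotated by 180 degrees. A minimal sketch on a synthetic matrix (the B built from magic(20) here is a stand-in, not the magnetization data):

```matlab
% Synthetic matrix with the same symmetry: B equals itself rotated by 180 degrees.
A = magic(20);
B = A + rot90(A, 2);

% Column k of rot90(B,2) is flipud(B(:,20-k+1)), so this compares every
% column with its flipped partner in a single step.
colErr = max(abs(B - rot90(B, 2)), [], 1);     % 1-by-20 vector of per-column errors
isSymmetric = all(colErr < 1e-8);
fprintf('All columns near-identical: %d\n', double(isSymmetric));
```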
