GNU tar problem

I've been using a 1998 version of GNU tar for years and it has worked fine. Recently, it has started balking at working correctly. Is it suffering from bit rot in its old age?

The specific problem is that (almost) regardless of the file it's told to archive, it returns an error message of ": Unknown file type; file ignored." What's really mystifying is how a file can be of the wrong type for tar since tar shouldn't be concerned with a file "type". [I strongly suspect the error message is erroneous.]

Has anyone else encountered this problem?

Reply to
Everett M. Greene

And this relates to embedded systems how?

That set aside, even if you think the failure happens regardless of the file it's told to archive, that doesn't mean we can actually read your mind over the internet to see what the actual commands passed to that ancient version of 'tar' might have been.

The files _in_ the archive are not likely to be affected. The archive itself, however, must be a tar file for this to work. And then there's the possibility that you may have been requesting a special feature of your tar version to recognize text files and treat them differently.

And 'bit rot' is quite a lot less likely the reason than a superficially unrelated change to your system having changed what actually gets executed when you call "tar" on the command line.

Reply to
Hans-Bernhard Bröker

Directly, not at all. Indirectly, quite widely.

This is happening when creating a tar archive so there is no previous content with which to be concerned. I have numerous canned processes that I've used for years and now they don't work at all.

I've found a later version of tar that I can try except that the developer of the update decided to package the components as a tar file. I can't untar the package to get the updated tar!

Reply to
Everett M. Greene

... snip ...

A tar file is an extremely simple thing. The content files themselves are simply copies, preceded by an indexing prefix. The prefix describes such things about the file as dates, length, etc. There may be an overall prefix which summarizes all the prefixes. So extraction, after deciding which file to extract, is simply copying a portion of the tar file.
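
To illustrate, here is a minimal sketch (assuming the POSIX ustar header layout; checksum verification omitted, and not how GNU tar itself does it) of walking those prefixes to list what an uncompressed tar file contains:

/* Minimal sketch: list the members of an uncompressed tar archive by
 * walking its 512-byte headers.  Assumes the POSIX ustar layout, names
 * shorter than 100 characters, and a plain (not gzipped) archive. */
#include <stdio.h>
#include <stdlib.h>

struct ustar_header {       /* one 512-byte header block */
    char name[100];
    char mode[8];
    char uid[8];
    char gid[8];
    char size[12];          /* member length, octal ASCII, NUL/space ended */
    char mtime[12];
    char chksum[8];
    char typeflag;          /* '0' regular, '2' symlink, '5' directory, ... */
    char linkname[100];
    char magic[6];
    char version[2];
    char uname[32];
    char gname[32];
    char devmajor[8];
    char devminor[8];
    char prefix[155];
    char pad[12];
};

int main(int argc, char **argv)
{
    struct ustar_header h;
    FILE *fp;

    if (argc != 2 || (fp = fopen(argv[1], "rb")) == NULL) {
        fprintf(stderr, "usage: tarlist archive.tar\n");
        return 1;
    }
    while (fread(&h, 1, sizeof h, fp) == sizeof h && h.name[0] != '\0') {
        long size   = strtol(h.size, NULL, 8);     /* octal field */
        long blocks = (size + 511) / 512;          /* data padded to 512 */
        printf("%c %10ld %s\n", h.typeflag ? h.typeflag : '0', size, h.name);
        fseek(fp, blocks * 512L, SEEK_CUR);        /* skip the member data */
    }
    fclose(fp);
    return 0;
}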

Some tar files are also compressed, and these are normally marked as .tgz files, or as .tar.gz files. Uncompressing the .tgz with gzip will yield a tar file, and the above applies.

.zip files are generally a better (and newer) mechanism. But that does not address your problem.

--
 [mail]: Chuck F (cbfalconer at maineline dot net) 
 [page]: 
            Try the download section.
Reply to
CBFalconer

Why did you not do this? Do you seriously expect people to be able to help without seeing any actual command line input or error message output from that mysterious, anonymous "tar" you're referring to?

And you consider them so sacred that you can't show any of them?

You missed my point. You say that "now" it no longer works --- so something must have changed between "then" and "now". Since you left us with nothing to work on, you'll have to find out yourself what that something is. It may help to start with "when?", then proceed to "what did I do then?"

It can hardly be bit rot of the software itself, so the change is probably in your system environment. E.g. you could unknowingly have installed *another* program also called "tar", which now gets called under that name but does a totally different job, or expects options in a different format. Or you could have changed the system-wide presets of your tar program.

Reply to
Hans-Bernhard Bröker

Edit it as a binary file, and pick off the pieces.

--
ArarghMail802 at [drop the 'http://www.' from ->] http://www.arargh.com
BCET Basic Compiler Page: http://www.arargh.com/basic/index.html

To reply by email, remove the extra stuff from the reply address.
Reply to
ArarghMail802NOSPAM

zip files are *not* "generally a better mechanism". Zip files and tgz (or tar.bz2) files have different strengths and weaknesses, and are used in different places. One difference is that zip files are more common in the DOS+Windows world, while tar and tar.gz are more common in the *nix world.

On a technical level, zip files consist of a bunch of files that are compressed, then bundled together, while tgz files consist of a bunch of files that are bundled together, then the whole bundle is compressed. This makes zip files better if you want to work with individual files in the archive (adding new files, extracting a few files), and means less memory will be needed for compressing or extracting files. tgz files will give better compression, especially for many similar files, and are better suited to streaming and pipelining usage (there is no "tar-gzip" program - the output of "tar" is piped directly into "gzip"). So neither format is "better", even though they overlap somewhat in usage.
Reply to
David Brown

I presented the question to see if anyone had ever encountered such a problem in the past, and to jog their memory as to what they had encountered when the problem arose. I wasn't expecting anyone to have detailed trouble-shooting answers, so I didn't bother to include specific commands.

FWIW: The only command I ever use is:

tar -c -f -T

A little digging in the source finds that tar is looking at file protection/permission bits to determine a file's "type". I produced a quicky program to see what the stat() function is obtaining. All the files I'm trying to archive show the expected rwed bits set along with a bit indicating that they are "regular" files. How this can be a wrong "file type" is mystifying.
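
A minimal sketch of that kind of check, assuming a POSIX-style stat() is available (this is only an illustration, not the exact program), would look something like:

/* Print the raw st_mode for each argument and whether the standard
 * macros consider it a regular file. */
#include <stdio.h>
#include <sys/stat.h>

int main(int argc, char **argv)
{
    struct stat st;
    int i;

    for (i = 1; i < argc; i++) {
        if (stat(argv[i], &st) != 0) {
            perror(argv[i]);
            continue;
        }
        printf("%s: st_mode = 0%o (%s)\n", argv[i], (unsigned)st.st_mode,
               S_ISREG(st.st_mode) ? "regular file" : "not a regular file");
    }
    return 0;
}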

[Late news: I had a copy of the GNU tar source, so I built a fresh copy of tar from it with the idea of being able to use a debugger to get a better handle on the nature of the problem. When I ran it, it worked without a problem. So the problem is "resolved", although I don't know why the version I've been using for so long has suddenly failed.]
Reply to
Everett M. Greene

... snip ...

I have no disagreement with your facts, but I do feel your evaluation is faulty, especially as long as you are using gzip. bzip2 can compress much better, but seems to upset those unfamiliar with it. When using gzip with tar, as compared to just zip, I think there is no replacement for the ability to extract individual files. In addition, zip gives the ability to add or replace files as required (although replacement hides the major complications).

Availability of utilities for packaging is about equal today.

At any rate this argument is much like arguing over code format. :-)

--
 [mail]: Chuck F (cbfalconer at maineline dot net) 
 [page]: 
            Try the download section.
Reply to
CBFalconer

... snip ...

I assume you are using Linux, and thus no defragging programs. This greatly reduces the possibility of problems. However, if you are lacking ECC memory, any copying of the original files can leave an undetected bit drop. The only detection method of which I am aware is an MD5sum file on everything concerned. This includes shared libraries.

--
 [mail]: Chuck F (cbfalconer at maineline dot net) 
 [page]: 
            Try the download section.
Reply to
CBFalconer

I agree entirely about bzip2. But the ability to remove or add single files to a zip file is almost irrelevant in practice (unless my usage is wildly abnormal) - virtually all zip files I make are packed and unpacked in one action. And if I have a tgz (or tar.bz2) file from which I just need a single file, any decent compressed file gui (such as 7zip on windows) will handle that fine.

Mostly the decision boils down to zip being natural on windows, and tgz (and friends) being the natural choice on *nix.

My point is merely that you can't claim one format is so much better than the other, when they can both do the same job, but with different strengths and weaknesses.

Yes, that's certainly true. Just like code formatting, there are many ways to do compression - you do it in your way, and I do it the right way!

Best regards,

David

Reply to
David Brown

This is not true. tar is very much aware of what type of file it handles and treats normal files, directory files, symbolic links, network mounts, char special devices etc. all differently. So if it cannot find out whether or not a certain file is a directory, it has every right to complain.
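
To make that concrete, a rough sketch (a hypothetical helper, not GNU tar's actual code) of the classification a tar program has to do before it can write a header might look like:

/* Map the lstat() mode to a ustar typeflag, and complain when nothing
 * matches - roughly the decision that produces "Unknown file type". */
#include <stdio.h>
#include <sys/stat.h>

static int ustar_typeflag(mode_t m)
{
    if (S_ISREG(m))  return '0';   /* regular file           */
    if (S_ISLNK(m))  return '2';   /* symbolic link          */
    if (S_ISCHR(m))  return '3';   /* character special file */
    if (S_ISBLK(m))  return '4';   /* block special file     */
    if (S_ISDIR(m))  return '5';   /* directory              */
    if (S_ISFIFO(m)) return '6';   /* FIFO                   */
    return -1;                     /* nothing recognized     */
}

int main(int argc, char **argv)
{
    struct stat st;
    int i, t;

    for (i = 1; i < argc; i++) {
        if (lstat(argv[i], &st) != 0) {
            perror(argv[i]);
            continue;
        }
        t = ustar_typeflag(st.st_mode);
        if (t < 0)
            printf("%s: Unknown file type; file ignored\n", argv[i]);
        else
            printf("%s: typeflag '%c'\n", argv[i], t);
    }
    return 0;
}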

The first thing you should do is run ``file'' on the offending files, and see what happens. Chances are that ``file'' comes up with the same sort of message and your trouble is not with tar.

Some ways I can imagine this happening are an invalid device, severe file system corruption, or a botched network.

Greetings, Albert

--

--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- like all pyramid schemes -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst
Reply to
Albert van der Horst

Fragmentation of disks should *never* cause errors - that's a myth put about by defrag program vendors. Fragmentation can make disks slower than necessary, and possibly increase the disk wear slightly. But errors in reading and writing files are not caused by fragmentation - even FAT and NTFS do not get more unreliable when fragmented.

It's also a myth that Linux file systems don't get fragmented, and it's a myth that there are no defrag programs - ext2 has a defragmentation program (I can't remember off-hand if it works on ext3 too, or if you have to mess around with mounting as ext2), and xfs has a defrag program. Any good filesystem will allow fragmentation, so that you can continue to write large files even as the disk fills up, and so that you can append to existing files even if there is other data after the original file.

What makes the difference between Linux and Windows regarding fragmentation is that Linux does not have a totally brain-dead pseudo-random allocation policy like windows does - a Linux filesystem will fragment files when it is necessary, unlike windows which will start fragmenting files when there are more than about 3 files on the partition.

Corruption due to bad memory is certainly a possible cause. MD5 sums are not the only detection method (a direct compare would work too), but they are certainly a possible check.

Reply to
David Brown

That's what I meant. Badly phrased. ... snip ...

The thing that really needs watching, and that only ECC can do, is non-repeatable errors, possibly due to cosmic rays. These can flip a single critical bit at a critical time, and leave no evidence. That's why defragging without ECC is scary.

--
 [mail]: Chuck F (cbfalconer at maineline dot net) 
 [page]: 
            Try the download section.
Reply to
CBFalconer

Maybe with defragging this is true, but surely they leave evidence under certain circumstances. A bit flip during the prime95 torture test (the GIMPS client) is detected all right. This gives an idea of how often such flips occur. Any bit flip in a GIMPS process leads to errors that are detected.

Cosmic rays are not scary. Virtually 100% of errors in GIMPS are at least detected. Running GIMPS programs for years, I have not had a single error. If you have parity RAM, all single-bit cosmic errors will halt your machine. There is a reason parity RAM has been abandoned: it just never happens in practice.

Disclaimer: I'm not on a spacecraft.

N.B. The OP mentions MD5, but of course any CRC check (32- or 16-bit) is probably good enough. For an embedded system a 16-bit CRC is quite practical.

Greetings, Albert

--

--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- like all pyramid schemes -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst
Reply to
Albert van der Horst

... snip ...

Agreed. About 20 years ago I wrote a program, validate, which (on DOS) installs a 16-bit CCITT CRC checksum at the end of a program file, extending it by 2 bytes. The result always yields a 0 checksum, and goes everywhere with the file. Source lost in a disk crash, but the binary is still available below.
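
The underlying trick is a standard CRC property: with a non-reflected CRC and no final XOR, appending the CRC high byte first makes the CRC of the extended data come out to zero. A minimal sketch of the idea, assuming CRC-16/CCITT (polynomial 0x1021, initial value 0xFFFF, MSB first) - the original validate program may have differed in the details:

/* Sketch of the "append the CRC so the whole file checks to zero" idea,
 * using CRC-16/CCITT: poly 0x1021, init 0xFFFF, no reflection, no xorout. */
#include <stdio.h>
#include <stddef.h>

static unsigned short crc16_ccitt(const unsigned char *p, size_t n)
{
    unsigned short crc = 0xFFFF;
    while (n--) {
        int i;
        crc ^= (unsigned short)(*p++) << 8;
        for (i = 0; i < 8; i++)
            crc = (crc & 0x8000) ? (unsigned short)((crc << 1) ^ 0x1021)
                                 : (unsigned short)(crc << 1);
    }
    return crc;
}

int main(void)
{
    unsigned char buf[16] = "Hello, tar";   /* 10 data bytes + room for CRC */
    size_t len = 10;
    unsigned short crc = crc16_ccitt(buf, len);

    buf[len]     = (unsigned char)(crc >> 8);   /* append high byte first */
    buf[len + 1] = (unsigned char)(crc & 0xFF);

    /* The CRC over data plus the appended CRC comes out as 0x0000. */
    printf("crc = 0x%04X, check = 0x%04X\n",
           (unsigned)crc, (unsigned)crc16_ccitt(buf, len + 2));
    return 0;
}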

--
 [mail]: Chuck F (cbfalconer at maineline dot net) 
 [page]: 
            Try the download section.
Reply to
CBFalconer

Not having a scintillator monitoring nearby I cannot state with any certainty the cause, but several machines using parity ram here get a parity error about every 18 months or so. The soil in our area is more radioactive than normal and cosmic rays may also play a role.

If you are suggesting that ECC is preferable to parity then I agree, but one could assume you meant that no error checking is necessary.

Michael

Reply to
msg

Also, the higher the altitude the worse the cosmic ray problem becomes. For 32 bit or wider memories ECC is just as economical as parity, although the controller may be more complex.

--
 [mail]: Chuck F (cbfalconer at maineline dot net) 
 [page]: 
            Try the download section.
Reply to
CBFalconer

What OS?

Why haven't you upgraded your system?

Why are you posting on this newsgroup? (Why not alt.os.linux or something OS-specific?)

Dave.

Reply to
David T. Ashley

Well, I leave it to you to decide what level of error checking and recovery is necessary for what you want to do. I just said that I understand the manufacturers, because the average Windows user is not willing to pay for a machine reliable enough that only one error happens in 18 months (in my area that could be >100 years), let alone to be notified of it on top of that.

As a side note, I do have in my possession a Parsytec supercluster (64 transputers with 4 Mbyte ECC). If you are doing scientific calculations (physics: quantum chromodynamics), and publish results that shake the world, you want that extra reliability.

In practice I found that when doing calculations for months on end, the transputer links are the weakest link. They have no error detection of any kind, and there is no easy way to recover if a message is missed. The calculation hung numerous times, admittedly probably because of one of the other boxes, not the Parsytec. (Still have to compare my twin prime count with the literature.)

Not very balanced, and neither is a windows system with ECC memory (IMHO).

Greetings, Albert

--

--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- like all pyramid schemes -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst
Reply to
Albert van der Horst
