Linux Command-Line / Make puzzle

Somewhat OT, but you're a smart bunch of guys:

I'm doing some work on my web site, and trying to automate the process of separating the source of the whitepapers I post from the results (which should be, more and more, a bunch of PDF files).

So the way that I've set things up is that I have my "legacy" files in a directory structure that echoes the directory structure of my site, plus (thanks, SVN!) a bunch of hidden files that pertain to my version control system. The hidden files, but none of the desired files, have the string ".svn" somewhere in their path.

What I want to do is copy, wholesale, that whole directory structure, _excluding_ the version control files.

It'd be really cool to have something like 'cp', only one that filters by a regular expression. I've been trying various combinations of find, sed, and perl, and the complexity seems to be growing, rather than diminishing.

Is there some easy way of doing this that I'm missing? Is this a problem that any of you have solved already? Clearly, I can just write a little app that'll do it, but if there's some two-line solution out there then I'm open to suggestion.

--

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

Do you need to implement control loops in software?
"Applied Control Theory for Embedded Systems" was written for you.
See details at http://www.wescottdesign.com/actfes/actfes.html
Reply to
Tim Wescott

You could use 'tar', with the --exclude-vcs option

tar cf - --exclude-vcs -C srcdir files | tar xf - -C dstdir

Reply to
Arlet Ottens

Well, Tim, if you'd done a more thorough web search before posting your ignorance all over the world, you'd have found that putting the following lines in your makefile may be just the ticket:

cd pages
find . -type d -not -path "*.svn*" -exec mkdir -p ../deploy/{} \;
find . -type f -not -path "*.svn*" -exec cp -f {} ../deploy/{} \;
cd ..

Man, people can be so _dense_.

(It works a charm -- just in case someone else had this problem).

Reply to
Tim Wescott

How about a two-step process:

  1. Copy everything.
  2. Delete all the .svn files from the copy.

Not having more than minimal experience with the linux command line stuff, I don't know how to implement part 2 for nested directories.
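For step 2, find's `-prune` does the nested-directory part; a minimal sketch, assuming the copy lives in a directory named `deploy/` (a hypothetical name):

```shell
# Delete every .svn directory (and everything inside it) under deploy/.
# -prune keeps find from trying to descend into directories it just removed.
find deploy -type d -name .svn -prune -exec rm -rf {} +
```

The `-prune` matters: without it, find tries to walk into the directories rm has already deleted and complains about them.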

Mark Borgerson

Reply to
Mark Borgerson

1: Copy everything
1a: Have the build fail _right there_
1b: Have all these duplicated version control files on your disk, poised and ready to make _serious_ trouble.

Not that I'm, like, paranoid or anything.

Reply to
Tim Wescott

or, even simpler:

rsync -a --cvs-exclude src dst

Reply to
Arlet Ottens

Use find(1) to grab all of the files of interest (either including or excluding based on a regex template) then copy (or symlink, if you would prefer) the originals into the new file hierarchy.

You'll find (heh heh) "{}" to be your friend when creating the exec script for find.

On some systems, there are mechanisms to build these mirrored hierarchies. E.g., some makes support this directly.
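A sketch of that approach, using symlinks and assuming source in `pages/` and target `deploy/` (hypothetical names):

```shell
# Mirror the directory skeleton, skipping anything under .svn:
cd pages
find . -type d -not -path '*/.svn*' -exec mkdir -p ../deploy/{} \;
# Symlink each regular file into the mirrored tree. The sh -c wrapper
# keeps the {} substitution portable and space-safe.
find . -type f -not -path '*/.svn*' \
    -exec sh -c 'ln -sf "$PWD/$1" "../deploy/$1"' _ {} \;
cd ..
```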

Reply to
D Yuniskis

You might also find -name "*.pdf" to be of use (or related).

And, you might prefer to use symlinks (or hardlinks, depending on your filesystem layout) instead of explicit copies. Note that some large packages build with this sort of "shadow hierarchy" (e.g., BSD kernels)

find(1) is something that *everyone* should keep in their toolkit. It's often a lot easier to hack together a command/script using it than it would be to try to remember which commands have recursive descent options, etc. (e.g., I always use it with grep)
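For the grep case he mentions, a typical combination looks like this (file names and search string are just for illustration):

```shell
# Skip .svn directories entirely, then list the .c files mentioning TODO.
find . -name .svn -prune -o -type f -name '*.c' -exec grep -l 'TODO' {} +
```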

Reply to
D Yuniskis

Normally, you *don't* want to copy all files, just update the ones that have changed. For that, you should use rsync.

For VCS, if you use Git or Mercurial, you can make and test changes locally, check them in, then push the changes to the VCS repository on your live site, and once that's done, check out the live version. Another common deployment option is to have your site do a checkout from a web-accessible VCS server.

Clifford Heath.

Reply to
Clifford Heath

I only occasionally get the grep recursive syntax right -- I almost always get it right with find.

Reply to
Tim Wescott

It's a recent addition to grep(1) [or, am I thinking of some other command? :< ] Regardless, I find find(1) easier to grok -- I can think to myself: "OK, look at everything having a name that matches *this*, that also happens to be a *that*, and do the_following with its name inserted *here*".

Of course, it is far more expensive than using a command that directly supports recursive descent through the file hierarchy. But, I'm not running on a business server that has other *real* users looking for CPU cycles.

Where I usually screw myself is with "-name" qualifications. E.g., forgetting files that begin with a period, etc. :<

Reply to
D Yuniskis

If your files already are under version control, why not simply use the version control system to generate the directory structure? 'svn export' seems to be a perfect match:

# The second form [svn export [-r REV] PATH1[@PEGREV] [PATH2]] exports a
# clean directory tree from the working copy specified by PATH1 into
# PATH2. All local changes will be preserved, but files not under
# version control will not be copied.

Disclaimer: I don't use SVN, but 'cvs export' works quite well for me for generating source tarballs.

Stefan

Reply to
Stefan Reuther

I second that recommendation.

I'd also add "--delete" and "--delete-excluded" options.

I don't know how big the collection is, but an advantage of using rsync is that it will only copy over files that it needs to - thus when you add or remove files from the source tree and run the rsync again, it does the minimum amount of work needed to make the copy.

Reply to
David Brown

Actually, it's not even formally an addition --- it's a non-portable extension found in one particular (although widely popular) implementation. The applicable official definitions (i.e. POSIX, SUS) of 'grep' don't have recursion.

It also goes a bit against the grain of traditional Unix tools, particularly the "one tool one task" principle. Finding files recursively is, by that principle, the job of find(1), not grep(1). But I guess people using these tools outside their proper context (e.g. on Windows) don't know the power of a proper tool _set_, and successfully lobbied for adding a "poor man's find(1)" into grep.

Reply to
Hans-Bernhard Bröker

How do you reliably pass the files from "find" to grep if they have embedded spaces etc? (Honest question).

--

John Devereux
Reply to
John Devereux

find -print0 | xargs -0
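Spelled out with grep on the receiving end (pattern and glob are just placeholders):

```shell
# -print0 separates names with NUL bytes and xargs -0 splits on NUL,
# so spaces (and even newlines) in file names pass through intact.
find . -type f -name '*.txt' -print0 | xargs -0 grep -l 'pattern'
```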

Reply to
Arlet Ottens

Cool, thanks!

--

John Devereux
Reply to
John Devereux

Copy? Come on, you only need a list of files.

find . > ape

You only need to delete the names from the list.

fgrep -v '.svn' ape > ape2

Now you can pass the list to tar

tar cf ape.tar `cat ape2`

These are the infamous back quotes. `cat ape2` is a string with all the files.

(For education: very often it is as simple as adding a

-r , for recursive, option. )

Unix (Linux) is the best Integrated Development System ever. Instead of forever searching for an appropriate point-and-click application, you split what you want into steps. Each step is bound to be executable by a simple command.

(A guru doesn't need all those intermediate ape files and I know about xargs. Don't bother to tell me. )
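One caveat worth adding: tar recurses into any directory named on its command line, which would quietly pull the .svn contents back in, and the back quotes choke on file names containing spaces. Restricting the list to plain files and feeding it to GNU tar with -T sidesteps both; a sketch using his intermediate files:

```shell
# List only regular files, drop anything under .svn, archive the rest.
find . -type f > ape
fgrep -v '/.svn/' ape > ape2
tar cf ape.tar --no-recursion -T ape2
```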

Groetjes Albert

--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst
Reply to
Albert van der Horst

Not necessarily. "find ... -exec ..." can be quite expensive, but "find ... | xargs ..." shouldn't be.
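The batching form is also available inside find itself, spelled `-exec ... +`: arguments are packed into as few invocations as possible, just as xargs does. For example (glob for illustration only):

```shell
# One chmod invocation per large batch of files,
# instead of one process per file as with -exec ... \;
find . -type f -name '*.sh' -exec chmod +x {} +
```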

Reply to
Nobody
