How to use VCS (git) to save output binary files

I'm very new to VCS and git, so I'm having some difficulty using them.

I'd like to use git for my embedded projects. A must-have feature for me is the ability to retrieve the binary file of an old release. That way, I can reprogram a real device with EXACTLY the same binary, even years after the release.

To do so, I think I need to add the output binary files (maybe even the object files) to the repository. But this means the output of the git status command will be cluttered with useless information (at every change in the source code, many object files change as well).

Any suggestions?

Re: How to use VCS (git) to save output binary files
snipped-for-privacy@gmail.com writes:

1) use .gitignore to suppress the status messages
2) If the files are large, consider git-annex to track them without
   actually putting them in the git repo.

Re: How to use VCS (git) to save output binary files

As I understand it, .gitignore excludes files from tracking entirely (it doesn't only suppress status messages).
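
For reference, here is a small sketch of what an ignore rule does to git status, run in a scratch repository. The file names and the "*.o" pattern are made up for illustration; main.o is only a stand-in for a real build product.

```shell
#!/bin/sh
# Scratch-repo sketch: an ignored, untracked file does not appear in
# "git status" unless --ignored is requested.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

echo "int main(void) { return 0; }" > main.c
echo "*.o" > .gitignore
git add .gitignore main.c
git commit -q -m "Sources only"

echo "object code" > main.o      # stand-in for a build product

plain=$(git status --porcelain)                  # nothing to report
with_ignored=$(git status --porcelain --ignored) # lists main.o as ignored

echo "plain status:   '$plain'"
echo "with --ignored: '$with_ignored'"
```

The file is never added to the index, so it is both untracked and hidden from the normal status output.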


What do you mean by "large"? I don't think they are big files (I usually work with MCUs with internal Flash memory).

However, I want to clarify that I don't need to track binary files for every commit. That wouldn't be useful.
I think it's better to have a full snapshot (including binaries) only for production releases.
I understand a production release should be tagged in git, so it would be sufficient to attach the additional untracked files (binaries) to the tagged commits.

Another possibility is to save the *full* project directory somewhere else. But in that case I'd have two copies of the source (one in the git repo), with a risk that they get out of sync.

Re: How to use VCS (git) to save output binary files
On 10/09/15 15:39, snipped-for-privacy@gmail.com wrote:


If you use .gitignore to ignore binaries, you can still force-add them  
during your release build process so that those files only get
committed on a release build. I would add this into a special target in  
your makefile so it's simple to make a release commit. Use:

git add --force some-file

Otherwise arrange a process that copies the released files into a  
parallel directory with its own git repository, add and commit them  
there, then tag the source code tree with the commit number of that  
commit. That way your source tree remains small, and you can always  
retrieve the exact source code used to build each released version.

Personally I much prefer the second option.
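
A rough sketch of how the second option could look, using two scratch repositories. The directory names, the file names and the idea of recording the binary commit in the tag message are assumptions made for illustration; the post only says to tag the source tree with the commit number of the binary commit.

```shell
#!/bin/sh
# Sketch of the "parallel repository" idea: binaries live in their own
# repo, and the source tag records which binary commit belongs to it.
set -e
work=$(mktemp -d)
src="$work/project"            # source repository (hypothetical name)
rel="$work/project-releases"   # separate repository for released binaries

mkdir "$src" "$rel"
for r in "$src" "$rel"; do
    git -C "$r" init -q
    git -C "$r" config user.name "Example"
    git -C "$r" config user.email "example@example.invalid"
done

# Source tree: build products are ignored
cd "$src"
printf 'Release/\n' > .gitignore
echo "int main(void) { return 0; }" > main.c
git add .gitignore main.c
git commit -q -m "Release 1.0 sources"

# Stand-in for the real build
mkdir -p Release
echo ":00000001FF" > Release/binary.hex

# Copy the released file into the parallel repo and commit it there
cp Release/binary.hex "$rel/binary.hex"
git -C "$rel" add binary.hex
git -C "$rel" commit -q -m "Binary for release 1.0"
bin_commit=$(git -C "$rel" rev-parse --short HEAD)

# Tag the source tree, noting the binary repo's commit in the tag message
git tag -a v1.0 -m "v1.0 (binaries: commit $bin_commit in project-releases)"

tag_message=$(git for-each-ref --format='%(contents:subject)' refs/tags/v1.0)
echo "$tag_message"
```

The source repository never sees the .hex file, so clones of it stay small.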

Clifford Heath.

Re: How to use VCS (git) to save output binary files
On 10/09/2015 07:54, Clifford Heath wrote:

Suppose the project tree has two sub-folders named Release and Debug  
with all the files generated during the build process (listing, objects,  
binaries, ...) that I don't need to track during normal commit.

I would write the following lines in .gitignore (note trailing / for the  
folder):
   Release/
   Debug/

The process to generate a commit and tag for a production release (with  
binary) would be:

   make all
   git add --force Release/binary.hex
   git commit -m "Commit for the production release 1.0"
   git tag -a v1.0 -m "v1.0: first production release (with binary)"

Now I start making some modifications for the next production release. I
think the next commit will still include Release/binary.hex! Starting from
the production release commit, the force-added files will
continue to be tracked by git.
Should I manually remove them after creating the tag?



I don't know if I understood correctly.

What do you put in the "parallel directory"? Sources *and* binaries, or  
only the binaries? I think only the binaries.

In this case, why have a repository for the "parallel directory"?
What is the point of tracking released files in the "parallel directory"
with git?
I would tend to copy the released files into a new folder for each
release, without tracking. The difference between two successive
releases isn't important.


Re: How to use VCS (git) to save output binary files
On 10/09/15 18:17, pozz wrote:

No. The new commit will not add the ignored files, even if you  
force-added those files previously.


Yes, just what is released, and any debug symbol libraries, etc.
The sources are tagged in the source repository with the commit number
of the binary repo.


It keeps the source directory clean and small, to speed up any  
operations which do not require the full history of released binaries  
(such as cloning a repo for automated testing). Note that the repository
where the binaries are maintained cannot be cloned without all binaries  
for *all* releases being copied into the .git/objects directory. Your  
source trees, and your test trees, and any experimental "spike" trees do  
not need that; not even your production build machines do. Only your  
customer support environment needs it.


Clifford Heath.


Re: How to use VCS (git) to save output binary files
On 10/09/2015 11:29, Clifford Heath wrote:

I have tried this on a test repo, and it seems to me that a file added with
"git add -f" is still tracked in later commits.

git init
echo main.hex >.gitignore
echo "This is a source file" >main.c
echo "This is a binary file" >main.hex
git add *
git add .gitignore
git commit -m "First commit without binaries"
echo "Some changes on main.c" >>main.c
echo "v1.0 binary file" >main.hex
git add -f main.hex
git commit -a -m "First production release with binaries"
git tag -a v1.0 -m "Version 1.0"
echo "Other changes" >>main.c
echo "Another new binary file" >main.hex
git commit -a -m "First commit after production"


Now, after "git checkout v1.0", the content of main.hex is correctly "v1.0
binary file".
After "git checkout master", the content of main.hex is "Another new binary file".

This means main.hex *is* tracked for newer commits.
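
One way to answer the earlier "should I manually remove them?" question is to drop the file from the index right after tagging. The sketch below is a variation on the test above in a fresh scratch repo; the extra "stop tracking" commit is an assumption about how one might organise it, not something proposed in the thread.

```shell
#!/bin/sh
# Sketch: force-add a binary for the release, tag it, then stop tracking
# it again with "git rm --cached" so later commits leave it alone.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

echo main.hex > .gitignore
echo "This is a source file" > main.c
git add .gitignore main.c
git commit -q -m "First commit without binaries"

echo "v1.0 binary file" > main.hex
git add -f main.hex
git commit -q -m "First production release with binaries"
git tag -a v1.0 -m "Version 1.0"

# Remove main.hex from the index only; the file stays on disk and the
# v1.0 tag still contains it.
git rm -q --cached main.hex
git commit -q -m "Stop tracking main.hex after the release"

echo "Another new binary file" > main.hex
status_after=$(git status --porcelain)   # main.hex is ignored again
tagged_hex=$(git show v1.0:main.hex)     # content stored with the tag

echo "status after release: '$status_after'"
echo "main.hex at v1.0:     '$tagged_hex'"
```

After the extra commit, rebuilding main.hex no longer shows up in git status, while the tagged release keeps its binary.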


Re: How to use VCS (git) to save output binary files

Just put your production release on a separate branch, and switch back to
master, such as:

Quoted text here. Click to load it
git commit -a -m "v1.0 release candidate" # commit final changes, check it works
git checkout -b release-branch # create a release branch and switch to it
Quoted text here. Click to load it
git checkout master # return to the master branch, with no main.hex
# possibly merge desired changes from release into master branch
# eg with git cherry-pick
# (for those changes you made to the release at the last minute)
Quoted text here. Click to load it
# committed back on master branch


I find committing FPGA bitfiles into my personal git repos is very handy -
they compress well, and 10MB of repo space per commit is a good tradeoff
against multi-hour synthesis times.  In our more important projects this is
managed by Jenkins so I don't need to handle it in git.

Theo

Re: How to use VCS (git) to save output binary files
On 10/09/2015 15:00, Theo Markettos wrote:

If I wanted to put *everything* (binaries, object files, ...) on the
production branch, I could do:

make all
git commit -a -m "v1.0 RC1"  # Check if it works
git checkout -b v1.0         # It works, create a branch for production
rm .gitignore
git add *                    # Add every file on production branch
git commit -m "v1.0"         # Update every file on production branch
git tag -a v1.0 -m "v1.0"    # Tag it as v1.0

git checkout master          # Switch back to master branch

What do you think?




Re: How to use VCS (git) to save output binary files

Seems sensible...

Theo

Re: How to use VCS (git) to save output binary files
On 10/09/15 22:45, pozz wrote:

I tested it again with the commands I normally use (git add --update .)
and it does behave the way you said. I'm sorry, it seems I was wrong.

Anyhow, I still wouldn't want large binaries in my source repositories,
because every clone needs a copy. If you're careful to push them only
to a production branch, and not to checkout all branches into a new
clone, I guess you could get away with it - but "git clone" will
normally fetch all branches.

Clifford Heath.

Re: How to use VCS (git) to save output binary files
On 10/09/15 07:18, snipped-for-privacy@gmail.com wrote:

I often include the final output files (such as .hex, .bin or .elf) in
the repositories, as an aid to getting exactly the same programming file
later on - also to be able to check that a re-compilation produces the
same results, and for convenience if I am developing on one machine but
the programming is being done from a different system.

But I don't see any reason to keep object files, or any of the other
temporary files that get generated - listing files, dependency files,
etc.  That would reduce your clutter enormously.

Another thing to consider, since you are new to version control, is if
git is the right choice for you and your project.  git is a great VCS,
but it is also quite complicated and can be hard to learn and use well.
And it is not very good with binary files.  An alternative would be
subversion, which has fewer features and capabilities, and a rather
different philosophy, but which might be simpler, clearer and more
appropriate for you.  We mainly use subversion, and find it a better fit
for more general development, both hardware and software, within small
groups at the same location.  For a couple of software-only projects
that are co-operations across a number of different offices, git has
advantages.

Re: How to use VCS (git) to save output binary files
On 10/09/2015 09:45, David Brown wrote:

That is exactly my goal. The problem I see arises during normal work.
Before committing the working tree after fixing a bug, I check
which files I touched and will be committed to the repo.
As you can imagine, this list will be full of binaries.



Now I understand your point.



...as I noted :-(



Each VCS has pros and cons. After reading some docs, articles, forums,  
blogs I thought git was the right choice for me: a full-featured and  
modern VCS. But I'm not sure.

Of course, I should try every VCS to make a good choice, but I don't  
have time for that. So I read other suggestions.

I'll check svn again.



This isn't my case.



Re: How to use VCS (git) to save output binary files
On 10/09/15 10:25, pozz wrote:

Yes.  It is a mistake to try to put /all/ files into the repository.
Put in all files that are actually needed to recreate the binary - i.e.,
the real source files.  This includes project settings files or other
such details, even if they don't look like source code.  I also add some
generated files if they are of particular convenience - such as the
final executables or binaries, pdf files generated from LaTeX source,
etc.  This means that other users on different machines can make use of
the output files without having to rebuild everything themselves.

But avoid backup files, temporary files, history files, log files,
debugging files, dependency files, and other such clutter.


You are absolutely right about the pros and cons of different systems.
There is no doubt that git offers more features than svn - but if those
features are not of use to you (and you don't expect them to be useful
in the near future), then they become a cost in terms of learning and
the possibility of mistakes.

You can think of subversion as giving you a linear history of snapshots
of your project directory.  (It does have branches, merging, etc., but
they are more complicated to use in svn.)  Each checkin is logically a
full new snapshot (but handled more efficiently than a full copy, of
course).

git handles multiple branches and paths - it gives you another
dimension.  Checkins are logically changes or patchsets, rather than
snapshots.  This makes branches and merging a natural and critical part
of git.

If you think you will work with multiple parallel branches, and move
changes between those branches, then go for git.  If that is likely to
be a rarity, keep it simple with svn.

The other key difference is that subversion should always have a single
central server.  For git, you can use many servers or no servers, or a
single central server.  That means more flexibility - and more scope for
getting confused and losing track of what you have where.  Arguably, git
requires more discipline to use reliably and safely.

Finally, svn is equally happy with Linux and Windows, and gui and
command lines, while git is more at home in Linux and with the command
line (though gui clients and Windows versions are now common, if you
look at web tutorials or other sources of information, it will mostly
assume Linux and the command line).  It is more common to find plugins
and support for Subversion than git on Windows programs.



Re: How to use VCS (git) to save output binary files
On 10.09.2015 at 11:31, David Brown wrote:


Exactly.

The general rule is: if human beings didn't create, nor routinely modify  
it by hand, it doesn't go into version control.  I.e. VC tracks your  
_work_, not the final product.

One should only deviate from that rule for files that

* not everybody in the team is equipped to re-build,
* just take too darn long to build, or
* are needed to bootstrap the first build of a fresh check-out

It's a good idea to set up and maintain (some variation of) a "make  
distclean" procedure.  This removes all files that can be re-generated  
from others, except possibly those already on the "ignore list".  If any  
changes still show after make distclean, that means something has to be  
checked into VCS: the modified file, or an update to the VC's ignore  
list, or the distclean script itself.
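
The "does anything still show after distclean?" check can be sketched in a scratch repository like this. The build and distclean shell functions are stand-ins for the real make targets, and the forgotten build.log is a made-up example of a leftover.

```shell
#!/bin/sh
# Sketch: after a clean-up, anything git still reports needs a decision
# (check it in, add it to the ignore list, or fix the clean-up).
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

echo "int main(void) { return 0; }" > main.c
git add main.c
git commit -q -m "Sources"

# Stand-ins for "make" and "make distclean"
build()     { echo obj > main.o; echo hex > main.hex; echo log > build.log; }
distclean() { rm -f main.o main.hex; }   # deliberately forgets build.log

build
distclean

leftovers=$(git status --porcelain)
if [ -n "$leftovers" ]; then
    echo "still showing after distclean:"
    echo "$leftovers"
fi
```

Here the leftover tells you that either the clean-up or the ignore list needs updating.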

Final output files (hex, map, build report, ...) that are deemed worth  
keeping for every "release" version should be checked in either in a  
separate system (see "configuration management"), or at least in a  
separate place / project within the VCS: the release archive.  To this  
end these files, in their build-time location, should be on the ignore  
list of the VCS, so they never show up as modified.  Because they'll  
practically _always_ be modified, there's really no point displaying  
them as such, nor should they ever be checked in at that location.

Then the release procedure can become roughly the following:

* crank up the version number
* make ; make check
* copy the few(!) "sacred" release files into the release archival place
* add those new files to the VCS (if archive is in VCS)
* make distclean
* check in everything that shows up
* put your VC's version of a "label" or "checkpoint" onto it

Then you build it again, and verify that you got the exact same hex  
files (and only trivial differences in the other "sacred" ones) as the  
ones just copied into the Release archival place.
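
The final verification step could be as simple as a byte-for-byte comparison. This is only a sketch: the archive layout and the file name firmware.hex are hypothetical, and the two files are created by hand here instead of by a real build.

```shell
#!/bin/sh
# Sketch: compare a freshly rebuilt hex file with the archived copy.
set -e
work=$(mktemp -d)
archive="$work/release-archive/v1.0"   # hypothetical archive location
build="$work/build"                    # hypothetical rebuild output
mkdir -p "$archive" "$build"

# Stand-ins for the archived release file and the rebuilt file
echo ":00000001FF" > "$archive/firmware.hex"
echo ":00000001FF" > "$build/firmware.hex"

# cmp -s is silent and only reports through its exit status
if cmp -s "$archive/firmware.hex" "$build/firmware.hex"; then
    result="identical"
else
    result="DIFFERENT"
fi
echo "firmware.hex: $result"
```

Files that embed a build timestamp will naturally fail such a check, which is presumably what "only trivial differences" refers to for the other release files.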

One radically different approach is to keep all generated, volatile  
files outside the source tree entirely.  This is referred to by some as  
an "out-of-tree build".  This means all those objects, listfiles,  
libraries, mapfiles and hexes never show up in VC status listings to  
begin with.


Suffice it to say that those are not criteria newbies should be worrying  
themselves about in a VCS.

What good is "modern" in a system whose primary purpose of existence is  
to still be useful to you when you need it, say, 20 years from now?  It  
surely will no longer be modern, then!

What good is "full-featured" to a newbie, who doesn't have the  
background to understand the majority of those features, let alone why  
one possible implementation of them might be better than another?


No, you shouldn't.  You should pick one that fulfills what requirements  
you actually have and isn't already widely ridiculed as totally  
bone-headed.  And you should plan to _stick_ with that choice for the  
foreseeable future.

If ever it's time for you to re-evaluate that decision, you'll know:  
there will be this nagging pain in the lower back that you can no longer  
ignore.  The bonus of having stuck to your original choice until that  
point is that by then you'll have the experience to know what really to  
look for in the replacement: remedy for that pain, to start with ;-)

Re: How to use VCS (git) to save output binary files

 > [...]

IMHO this is the best solution. Thinking about the purpose of a VCS,
it makes no sense to add the binaries to the repository. The release
files (hex, listings, ...) should be maintained in a different place.

The only link between the release archive and the source tree is the
name of the release: the commit in the VCS that corresponds to the
official release should be named/tagged in the same way as the release
archive (v1.0, for example).

Of course, since the process is manual, there is a risk that the
build result of the v1.0 tag in the VCS isn't the same as the files in the
v1.0 official release archive. The developer must be disciplined to avoid
this problem. Maybe some scripts could help.
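
A script along these lines might reduce that risk. It is only a sketch: the archive location, the Release/binary.hex path and the SOURCE_COMMIT marker file are assumptions, and the build is replaced by a stand-in.

```shell
#!/bin/sh
# Sketch: only archive release files when the working tree is clean and
# HEAD sits exactly on a tag; name the archive folder after that tag.
set -e
work=$(mktemp -d)
archive_root="$work/release-archive"   # hypothetical archive location
repo="$work/project"
mkdir -p "$repo" "$archive_root"
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

printf 'Release/\n' > .gitignore
echo "int main(void) { return 0; }" > main.c
git add .gitignore main.c
git commit -q -m "Release 1.0"
git tag -a v1.0 -m "v1.0"

mkdir -p Release
echo ":00000001FF" > Release/binary.hex   # stand-in for the real build

# Refuse to archive from a tree with uncommitted changes
if [ -n "$(git status --porcelain)" ]; then
    echo "working tree has uncommitted changes" >&2
    exit 1
fi

# Fails (and stops the script) if HEAD is not exactly on a tag
release=$(git describe --tags --exact-match HEAD)

dest="$archive_root/$release"
if [ -e "$dest" ]; then
    echo "$dest already exists" >&2
    exit 1
fi
mkdir "$dest"
cp Release/binary.hex "$dest/"
git rev-parse HEAD > "$dest/SOURCE_COMMIT"   # record the exact commit
echo "archived $release to $dest"
```

Because the folder name comes from git describe, the archive and the tag cannot drift apart by a typing mistake.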


Re: How to use VCS (git) to save output binary files
On 10/09/15 17:45, David Brown wrote:

If you're really serious about that, you do your production builds in a  
virtual machine, and archive a complete copy of that VM - including all  
compilers etc. That's what we used to do anyhow. But that was for legal  
escrow purposes, not for debugging problems in old releases of deployed  
software.

Clifford Heath.


Re: How to use VCS (git) to save output binary files
On Wed, 09 Sep 2015 22:18:40 -0700, pozzugno wrote:


If your process is such that "official" binary releases are infrequent,
archive them separately.  The "one floppy set per version" method of
version control sucks for keeping track of code during development, but  
it's not a bad way to make sure that the customer gets what they expect.

--  

Tim Wescott
Wescott Design Services
Re: How to use VCS (git) to save output binary files

One option for that is to use Subversion for the 'golden master' versions
including binaries, and git for day-to-day source code development
(including random throwaway branches and experiments).  git-svn is a good
bridge between the two.  SVN sucks for daily use, but using it as something
you interact with infrequently avoids much of the suckiness.

Theo

Re: How to use VCS (git) to save output binary files
On 10/09/2015 07:18, snipped-for-privacy@gmail.com wrote:

I found a nice article here:  
http://nvie.com/posts/a-successful-git-branching-model/


In this workflow, there is a well-known Release branch whose purpose is
to prepare the official release (fixing documentation, small bugs,
version numbering, ...). After the release is ready, it is merged into
both the master *and* develop branches.

In the end, all the official releases are on the master branch, each
tagged with its version number. In this process, there is a well-defined
step that creates the new official release: the merge of the
last Release branch into the master branch. This step could be changed to
add other files (binaries, listings, maps, ...) to the resulting merge
commit on the master branch.

Of course, a developer will download all the binaries, together with the
sources, when they clone from the central repository. I don't know if it
is possible, but a developer interested only in... developing
(sources) could clone the whole repository except the master (official
releases) branch. Maybe the "git clone" command can do this.
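
git clone does have --single-branch and --branch options that go in this direction. The sketch below builds a throwaway "central" repository with a master branch carrying a binary and a develop branch without it, then clones only develop. The repository layout and file names are made up, and how release tags behave in such a clone is something to check separately.

```shell
#!/bin/sh
# Sketch: clone only the develop branch, leaving master (with binaries)
# behind on the server.
set -e
work=$(mktemp -d)
origin="$work/origin"       # stand-in for the central repository
clone="$work/dev-clone"
mkdir "$origin"
cd "$origin"
git init -q
git symbolic-ref HEAD refs/heads/master   # make the branch name explicit
git config user.name "Example"
git config user.email "example@example.invalid"

echo "int main(void) { return 0; }" > main.c
git add main.c
git commit -q -m "Sources"
git branch develop

# master additionally carries a release binary
echo "v1.0 binary" > main.hex
git add main.hex
git commit -q -m "v1.0 with binary"

# file:// goes through the normal transport, so only the objects needed
# for develop are transferred
git clone -q --single-branch --branch develop "file://$origin" "$clone"

checked_out=$(git -C "$clone" rev-parse --abbrev-ref HEAD)
fetched_master=$(git -C "$clone" for-each-ref refs/remotes/origin/master)

echo "checked out branch: $checked_out"
echo "origin/master ref:  '${fetched_master}'"
```

Later fetches in such a clone keep following only the branch it was created with.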


Just a final note on which VCS a newbie like me should use. Most of you
suggested avoiding git because of its complexity. It's true, I had some
difficulty understanding the concepts behind git. Maybe other VCSs,
like SVN, are simpler to understand.

But as I understand it, git introduced a very good practice absent in other
VCSs. When you decide to start working on a new feature, you create a
new branch, work on it for days, and finally merge it into the main branch.
With SVN, branches aren't used for daily work, only for exceptional
operations.


