How to use VCS (git) to save output binary files

I'm very new to VCS and git, so I have some difficulty using them.

I'd like to use git for my embedded projects. I consider it a must-have feature to be able to retrieve the binary file of an old release. That way, I can reprogram a real device with EXACTLY the same binary, even years after the release.

In order to do so, I think I need to add the output binary files (maybe even object files) to the repository. But this means the output of the git status command will be cluttered by useless info (at every change in the source code, many object files change as well).

Any suggestion?

Reply to
pozzugno

1) Use .gitignore to suppress the status messages.
2) If the files are large, consider git-annex to track them without actually putting them in the git repo.
Reply to
Paul Rubin

As far as I knew, .gitignore removes files entirely from the tracking process (it doesn't just suppress status messages).

What do you mean by "large"? I don't think they are big files (I usually work with MCUs with internal Flash memory).

However, I want to clarify that I don't need to track binary files for every commit; that wouldn't be useful. I think it's better to have a full snapshot (including binaries) only for production releases. I understand a production release should be tagged in git, so it would be sufficient to attach the additional untracked files (the binaries) to the tags.

Another possibility is to save the *full* directory of the project in another place. But in this case I'll have two copies of the source (one in the git repo) and there will be a risk that they get out of sync.

Reply to
pozzugno

If you use .gitignore to ignore binaries, you can still force-add them during your release build process so that those files only get committed on a release build. I would add this into a special target in your makefile so it's simple to make a release commit. Use:

git add --force some-file

Otherwise arrange a process that copies the released files into a parallel directory with its own git repository, add and commit them there, then tag the source code tree with the commit number of that commit. That way your source tree remains small, and you can always retrieve the exact source code used to build each released version.
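A runnable sketch of this second option; all repo names and file names are invented for illustration, and the build step is stood in for by writing the file directly:

```shell
set -e
# demo scaffolding: a source repo and a parallel binary-archive repo
base=$(mktemp -d)
git init -q "$base/project"
git init -q "$base/release-archive"
cd "$base/project"
echo 'int main(void){return 0;}' > main.c
git add main.c
git -c user.name=demo -c user.email=demo@example.com commit -q -m "v1.0 sources"
printf 'hex image\n' > binary.hex             # stands in for the build output
# copy the released file into the parallel repo and commit it there
cp binary.hex ../release-archive/
cd ../release-archive
git add binary.hex
git -c user.name=demo -c user.email=demo@example.com commit -q -m "v1.0 binary"
bincommit=$(git rev-parse --short HEAD)
# tag the source tree with the binary repo's commit id
cd ../project
git -c user.name=demo -c user.email=demo@example.com \
    tag -a v1.0 -m "binaries: release-archive@$bincommit"
```

The source repo stays small (binary.hex is never committed there), while the tag message records exactly where the matching binary lives.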

Personally I much prefer the second option.

Clifford Heath.

Reply to
Clifford Heath

I often include the final output files (such as .hex, .bin or .elf) in the repositories, as an aid to getting exactly the same programming file later on - also to be able to check that a re-compilation produces the same results, and for convenience when I am developing on one machine but the programming is being done from a different system.

But I don't see any reason to keep object files, or any of the other temporary files that get generated - listing files, dependency files, etc. That would reduce your clutter enormously.

Another thing to consider, since you are new to version control, is if git is the right choice for you and your project. git is a great VCS, but it is also quite complicated and can be hard to learn and use well. And it is not very good with binary files. An alternative would be subversion, which has fewer features and capabilities, and a rather different philosophy, but which might be simpler, clearer and more appropriate for you. We mainly use subversion, and find it a better fit for more general development, both hardware and software, within small groups at the same location. For a couple of software-only projects that are co-operations across a number of different offices, git has advantages.

Reply to
David Brown

Suppose the project tree has two sub-folders named Release and Debug with all the files generated during the build process (listing, objects, binaries, ...) that I don't need to track during normal commit.

I would write the following lines in .gitignore (note the trailing / for the folders):

  Release/
  Debug/

The process to generate a commit and tag for a production release (with binary) would be:

  make all
  git add --force Release/binary.hex
  git commit -m "Commit for the production release 1.0"
  git tag -a v1.0 -m "v1.0: first production release (with binary)"

Now I start making some modifications for the next production release. I think the next commit will still include Release/binary.hex! Starting from the production release commit, the force-added files will continue to be tracked by git. Should I manually remove them after creating the tag?

I don't know if I understood correctly.

What do you put in the "parallel directory"? Sources *and* binaries, or only the binaries? I think only the binaries.

In this case, why have a repository for the "parallel directory"? What is the goal of tracking the released files there with git? I would tend to copy the released files into a new folder for each release, without tracking them. The difference between two successive releases isn't important.

Reply to
pozz

This is my exact goal. The problem I see is during normal work: before committing the working tree after fixing a bug, I check which files I have touched and will be committed to the repo. As you can understand, that list would be full of binaries.

Now I understand your point.

...as I noted :-(

Each VCS has pros and cons. After reading some docs, articles, forums and blogs, I thought git was the right choice for me: a full-featured and modern VCS. But I'm not sure.

Of course, I should try every VCS to make a good choice, but I don't have time for that. So I'm relying on others' suggestions.

I'll check svn again.

This isn't my case.

Reply to
pozz

No. The new commit will not add the ignored files, even if you force-added those files previously.

Yes, just what is released, and any debug symbol libraries, etc. The sources are tagged in the source repository with the commit number of the binary repo.

It keeps the source directory clean and small, to speed up any operations which do not require the full history of released binaries (such as cloning a repo for automated testing). Note that the repository where the binaries are maintained cannot be cloned without all binaries for *all* releases being copied into the .git/objects directory. Your source trees, your test trees, and any experimental "spike" trees do not need that; not even your production build machines do. Only your customer support environment needs it.

Clifford Heath.

Reply to
Clifford Heath

Yes. It is a mistake to try to put /all/ files into the repository. Put in all files that are actually needed to recreate the binary - i.e., the real source files. This includes project settings files or other such details, even if they don't look like source code. I also add some generated files if they are of particular convenience - such as the final executables or binaries, pdf files generated from LaTeX source, etc. This means that other users on different machines can make use of the output files without having to rebuild everything themselves.

But avoid backup files, temporary files, history files, log files, debugging files, dependency files, and other such clutter.

You are absolutely right about the pros and cons of different systems. There is no doubt that git offers more features than svn - but if those features are not of use to you (and you don't expect them to be useful in the near future), then they become a cost in terms of learning and the possibility of mistakes.

You can think of subversion as giving you a linear history of snapshots of your project directory. (It does have branches, merging, etc., but they are more complicated to use in svn.) Each checkin is logically a full new snapshot (but handled more efficiently than a full copy, of course).

git handles multiple branches and paths - it gives you another dimension. Checkins are logically changes or patchsets, rather than snapshots. This makes branching and merging a natural and central part of git.

If you think you will work with multiple parallel branches, and move changes between those branches, then go for git. If that is likely to be a rarity, keep it simple with svn.

The other key difference is that subversion should always have a single central server. For git, you can use many servers or no servers, or a single central server. That means more flexibility - and more scope for getting confused and losing track of what you have where. Arguably, git requires more discipline to use reliably and safely.

Finally, svn is equally happy with Linux and Windows, and with guis and command lines, while git is more at home on Linux and the command line (gui clients and Windows versions are now common, but web tutorials and other sources of information will mostly assume Linux and the command line). It is also more common to find plugins and support for Subversion than for git in Windows programs.

Reply to
David Brown

If you're really serious about that, you do your production builds in a virtual machine, and archive a complete copy of that VM - including all compilers etc. That's what we used to do anyhow. But that was for legal escrow purposes, not for debugging problems in old releases of deployed software.

Clifford Heath.

Reply to
Clifford Heath

I have tried on a test repo and it seems to me that a file added with "git add -f" is tracked for newer commits.

  git init
  echo main.hex >.gitignore
  echo "This is a source file" >main.c
  echo "This is a binary file" >main.hex
  git add *
  git add .gitignore
  git commit -m "First commit without binaries"
  echo "Some changes on main.c" >>main.c
  echo "v1.0 binary file" >main.hex
  git add -f main.hex
  git commit -a -m "First production release with binaries"
  git tag -a v1.0 -m "Version 1.0"
  echo "Other changes" >>main.c
  echo "Another new binary file" >main.hex
  git commit -a -m "First commit after production"

Now, after "git checkout v1.0", the main.hex content is correctly "v1.0 binary file". After "git checkout master", the main.hex content is "Another new binary file".

This means, main.hex *is* tracked for newer commits.

Reply to
pozz

Just put your production release on a separate branch, and switch back to master, such as:

  git commit -a -m "v1.0 release candidate"   # commit final changes, check it works
  git checkout -b release-branch              # create a release branch and switch to it

  git checkout master   # return to the master branch, with no main.hex
  # possibly merge desired changes from the release branch into master,
  # eg with git cherry-pick (for those changes you made to the release
  # at the last minute), so they end up committed back on the master branch

I find committing FPGA bitfiles into my personal git repos is very handy - they compress well, and 10MB of repo space per commit is a good tradeoff against multi-hour synthesis times. In our more important projects this is managed by Jenkins so I don't need to handle it in git.

Theo

Reply to
Theo Markettos

If I'd like to push *everything* (binaries, object files, ...) in the production branch, I could do:

  make all
  git commit -a -m "v1.0 RC1"   # Check if it works
  git checkout -b v1.0          # It works, create a branch for production
  rm .gitignore
  git add *                     # Add every file on the production branch
  git commit -m "v1.0"          # Update every file on the production branch
  git tag -a v1.0 -m "v1.0"     # Tag it as v1.0

git checkout master # Switch back to master branch

What do you think?

Reply to
pozz

If your process is such that "official" binary releases are seldom, archive them separately. The "one floppy set per version" method of version control sucks for keeping track of code during development, but it's not a bad way to make sure that the customer gets what they expect.

--

Tim Wescott 
Wescott Design Services 
http://www.wescottdesign.com
Reply to
Tim Wescott

On 10.09.2015 at 11:31, David Brown wrote:

Exactly.

The general rule is: if human beings didn't create it, and don't routinely modify it by hand, it doesn't go into version control. I.e. VC tracks your _work_, not the final product.

One should only deviate from that rule for files that

  • not everybody in the team is equipped to re-build,
  • just take too darn long to build, or
  • are needed to bootstrap the first build of a fresh check-out

It's a good idea to set up and maintain (some variation of) a "make distclean" procedure. This removes all files that can be re-generated from others, except possibly those already on the "ignore list". If any changes still show after make distclean, that means something has to be checked into VCS: the modified file, or an update to the VC's ignore list, or the distclean script itself.
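That check can be sketched in a few lines of shell; here the "make distclean" run is stood in for by a toy repo with a stray leftover file (all names invented):

```shell
set -e
# demo scaffolding: a repo with one tracked source file
dir=$(mktemp -d); cd "$dir"; git init -q .
echo 'int main(void){return 0;}' > main.c
git add main.c
git -c user.name=demo -c user.email=demo@example.com commit -q -m "sources"
# stand-in for "make && make distclean" leaving a stray file behind:
touch main.lst
# the check: after distclean, the tree should show nothing at all
if [ -n "$(git status --porcelain)" ]; then
    echo "distclean left files behind - check them in, or extend the ignore list or the distclean script:"
    git status --porcelain
fi
```

An empty `git status --porcelain` output after distclean is a cheap, scriptable way to enforce the rule.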

Final output files (hex, map, build report, ...) that are deemed worth keeping for every "release" version should be checked in either in a separate system (see "configuration management"), or at least in a separate place / project within the VCS: the release archive. To this end these files, in their build-time location, should be on the ignore list of the VCS, so they never show up as modified. Because they'll practically _always_ be modified, there's really no point displaying them as such, nor should they ever be checked in at that location.

Then the release procedure can become roughly the following:

  • crank up the version number
  • make ; make check
  • copy the few(!) "sacred" release files into the release archival place
  • add those new files to the VCS (if archive is in VCS)
  • make distclean
  • check in everything that shows up
  • put your VC's version of a "label" or "checkpoint" onto it
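A runnable sketch of the steps above; the make targets are replaced by stand-ins and every file and path name is invented:

```shell
set -e
# --- demo scaffolding (a real project already has these) ---
dir=$(mktemp -d); cd "$dir"; git init -q .
git -c user.name=demo -c user.email=demo@example.com commit -q --allow-empty -m init
mkdir -p Release releases/v1.1
printf '#define VERSION "1.0"\n' > version.h
# --- the release steps ---
sed -i 's/1\.0/1.1/' version.h                # crank up the version number
printf 'hex for v1.1\n' > Release/binary.hex  # stands in for: make && make check
cp Release/binary.hex releases/v1.1/          # copy the "sacred" files to the archive
git add releases/v1.1                         # add the new files (archive is in VCS here)
rm -f Release/binary.hex                      # stands in for: make distclean
git add -A                                    # check in everything that shows up
git -c user.name=demo -c user.email=demo@example.com commit -q -m "release v1.1"
git -c user.name=demo -c user.email=demo@example.com tag -a v1.1 -m "v1.1"
echo "tagged: $(git describe --tags)"
```

The build-area copy of the hex never gets committed; only the copy in the archive directory does, together with the version bump and the tag.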

Then you build it again, and verify that you got the exact same hex files (and only trivial differences in the other "sacred" ones) as the ones just copied into the Release archival place.

One radically different approach is to keep all generated, volatile files outside the source tree entirely. This is referred to by some as an "out-of-tree build". This means all those objects, listfiles, libraries, mapfiles and hexes never show up in VC status listings to begin with.

Suffice it to say that those are not criteria newbies should be worrying themselves about in a VCS.

What good is "modern" in a system whose primary purpose of existence is to still be useful to you when you need it, say, 20 years from now? It surely will no longer be modern, then!

What good is "full-featured" to a newbie, who doesn't have the background to understand the majority of those features, let alone why one possible implementation of them might be better than another?

No, you shouldn't. You should pick one that fulfills the requirements you actually have and isn't already widely ridiculed as totally bone-headed. And you should plan to _stick_ with that choice for the foreseeable future.

If ever it's time for you to re-evaluate that decision, you'll know: there will be this nagging pain in the lower back that you can no longer ignore. The bonus of having stuck to your original choice until that point is that by then you'll have the experience to know what really to look for in the replacement: remedy for that pain, to start with ;-)

Reply to
Hans-Bernhard Bröker

IMHO this is the best solution. After thinking about the goal of a VCS, it is nonsense to add the binaries to the repository. The release files (hex, listings, ...) should be maintained in a different place.

The only link between the release archive and the source tree is the name of the release: the commit in the VCS that corresponds to the official release should be named/tagged in the same way as the release archive (v1.0, for example).

Of course, considering that the process is manual, there is the risk that the build result of the v1.0 tag in the VCS isn't the same as the files in the v1.0 official release archive. The developer must be disciplined to avoid this problem. Maybe some scripts could help.
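Such a script could be as simple as a byte-for-byte comparison; in this sketch all paths are invented and the "checkout the tag and rebuild" step is stood in for by writing the file directly:

```shell
set -e
# demo scaffolding: an archived hex and a freshly rebuilt one
dir=$(mktemp -d); cd "$dir"
mkdir -p Release archive/v1.0
printf ':00000001FF\n' > archive/v1.0/binary.hex
printf ':00000001FF\n' > Release/binary.hex   # stands in for: git checkout v1.0 && make all
# the check itself: the rebuilt file must match the archived one exactly
if cmp -s Release/binary.hex archive/v1.0/binary.hex; then
    echo "archive matches rebuild"
else
    echo "MISMATCH between tag v1.0 and the release archive" >&2
fi
```

Running this after every release (and periodically afterwards) catches the tag/archive drift before anyone relies on the archive.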

Reply to
pozz

Seems sensible...

Theo

Reply to
Theo Markettos

I tested it again with the commands I normally use (git add --update .) and it does behave the way you said. I'm sorry, it seems I was wrong.

Anyhow, I still wouldn't want large binaries in my source repositories, because every clone needs a copy. If you're careful to push them only to a production branch, and not to check out all branches into a new clone, I guess you could get away with it - but "git clone" will normally fetch all branches.

Clifford Heath.

Reply to
Clifford Heath

One option for that is to use Subversion for the 'golden master' versions including binaries, and git for day-to-day source code development (including random throwaway branches and experiments). git-svn is a good bridge between the two. SVN sucks for daily use, but using it as something you interact with infrequently avoids much of the suckiness.

Theo

Reply to
Theo Markettos

On 10/09/2015 07:18, snipped-for-privacy@gmail.com wrote:

I found a nice article here:

formatting link

In this workflow, there is a well-known Release branch whose goal is to prepare the official release (fixing documentation, small bugs, version numbering, ...). After the release is ready, it is merged into both the master *and* develop branches.

Finally, we have all the official releases on the master branch, each tagged with its version number. In this process, there is a well-defined step that creates the new official release: merging the last Release branch into the master branch. This step could be extended to add the other files (binaries, listings, maps, ...) to the resulting merge commit on the master branch.

Of course, a developer will download all the binaries together with the sources when cloning the central repository. I don't know if it is possible, but a developer interested only in... developing (the sources) could clone the whole repository except the master (official releases) branch. Maybe the "git clone" command can do this.
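git clone can in fact limit what it fetches with its --single-branch option. A self-contained demo (repo layout, branch names and file names all invented; here the binaries sit on a "release" branch and developers clone only master):

```shell
set -e
# demo scaffolding: an origin repo with sources on master
# and a binary-carrying "release" branch
base=$(mktemp -d)
git init -q "$base/origin"
cd "$base/origin"
echo 'main.hex' > .gitignore
echo 'source' > main.c
git add .gitignore main.c
git -c user.name=demo -c user.email=demo@example.com commit -q -m "sources"
git branch -M master
git checkout -q -b release
printf 'binary\n' > main.hex
git add -f main.hex
git -c user.name=demo -c user.email=demo@example.com commit -q -m "release with binary"
git checkout -q master
# a developer clones only master; the release branch is never fetched
cd "$base"
git clone -q --single-branch --branch master "file://$base/origin" dev-clone
cd dev-clone
git branch -r   # shows only master; no release branch, no binaries
```

The release branch (and the binaries it carries) stays on the server until someone explicitly asks for it.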

Just a final note on the VCS to use for a newbie like me. Most of you suggested avoiding git because of its complexity. It is true; I found it difficult to understand the concepts behind git. Maybe other VCSs, like SVN, are simpler to understand.

But I understand git encourages a very good practice that is absent from other VCSs. When you decide to start working on a new feature, you create a new branch, work on it for days, and finally merge it into the main branch. With SVN, branches aren't used for daily work, only for exceptional operations.
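That daily feature-branch cycle can be sketched in a few commands (the branch name and file contents are invented for the demo):

```shell
set -e
# demo scaffolding: a repo with one commit on master
dir=$(mktemp -d); cd "$dir"; git init -q .
echo 'v1' > main.c
git add main.c
git -c user.name=demo -c user.email=demo@example.com commit -q -m "start"
git branch -M master
# the workflow: branch, work, merge back, clean up
git checkout -q -b fix-uart-timeout    # new branch for one feature or bug fix
echo 'v2' > main.c
git -c user.name=demo -c user.email=demo@example.com commit -q -a -m "fix the timeout"
git checkout -q master
git merge -q fix-uart-timeout          # fast-forwards master to the fix
git branch -d fix-uart-timeout         # the short-lived branch is discarded
```

Because branches are so cheap in git, this works even for a one-commit bug fix, which is exactly what makes it practical as a daily habit.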

Reply to
pozz
