Tracking localization changes in repository

- D
- Don Y
  
  Contact options for registered users
posted
9 years ago

Sun, Dec 21, 2014 9:41 PM

Hi,

I run several OS's, here. I'd like to get *all* of them under CM/VCS. In the past, I've just tracked my NetBSD boxen as I've already got the source repository on-line, locally.

[Ignore, for the moment, my own sources...]

For the NBSD boxes, I've previously created a "localization" repository that holds the local changes/additions to my systems (tagged with a unique "Don-specific" tag so I can retrieve them).

This, for example, lets me track changes AND ADDITIONS to /etc, /usr/pkg/etc, etc. Much easier than trying to use scripteed methods to "update" after a new release.

But, here's the inconsistency: when I make changes to the sources for any of the executables, etc., those get tagged as "mine" and tracked in the "regular" NetBSD repository (local copy) -- as they *should* be! Note also that everything in the distribution is present in the repository... including the "sources" for /etc!

I.e., why am I treating my changes to /etc differently?

OToOH, I don't merge all my local changes into that repository when those changes aren't really pertinent to NetBSD "as a product". E.g., if I create a mount point for /ThumbDrive, it seems wrong to merge that in with the NetBSD portion of the repo...

Rather than *opinion*, I'd like to know how other folks WHO DILIGENTLY TRACK THEIR OS SOURCES do this.

[NB: NetBSD's repo is (presently) CVS based but that only slightly alters the rationale behind any methodology that folks might employ. It's really a question of how you view your local "customizations" wrt the OS "product".]

- D
- DecadentLinuxUserNumeroUno
  
  Contact options for registered users
Vote on answer
posted
9 years ago

Mon, Dec 22, 2014 12:06 AM

Sounds very labor intensive.

I always load the most bleeding edge thing that came out on yesterday's "daily". Then, even that takes on many megabytes of updates online, so it changes a lot even on the daily.

My own hardware specific elements, like the new realtek wifi chip and USB

3 thingy, I keep seperate, but the archive is not even close to being cumbersome, so it has never been much of a problem.

I have been keeping mSSATA drives now, and they are lightning fast.

and then you would only need to plug in that stik for whatever machine, and upkeep of each becomes easier and more reliable as well. Investigate the real viability of USB hubs! Heheheh...

Do NOT, however, use spinning media formatted with NTFS. It is very bad on disconnect/reconnect events.

I think that USB has an inherent connectivity (hard connection) issue when it comes to storage media. Folks need little mini supercaps inside them or something. Dang!

- C
- Clifford Heath
  
  Contact options for registered users
Vote on answer
posted
9 years ago

Wed, Dec 31, 2014 1:53 AM

This problem massively affects provisioning of cloud servers, and lots of very smart people have worked (and still are working) on different (now relatively mature) solutions to it.

IOW, you're looking in the totally wrong place for the answers you seek.

- D
- Don Y
  
  Contact options for registered users
Vote on answer
posted
9 years ago

Wed, Dec 31, 2014 1:21 PM

It also affects *anyone* who takes CM/VC seriously.

I know folks here are familiar with VC in their work (whether it be writing code or designing hardware). Schematics have revision numbers, ditto for software modules, etc. No one would just "use" a piece of code or a schematic without being aware of whether or not they had the right *version* in their hands.

Similarly, are involved in CM implicitly -- even if they don't realize they are doing so in their daily activities. E.g., ensuring that their *tools* and *environments* are "controlled" so they can effectively do their work (you wouldn't take kindly to finding your tools had been changed, periodically, without your knowledge/consent; nor would *you* willingly change them without careful consideration and a disciplined approach to verifying the functionality of the new ones).

You wouldn't ship a product without ensuring that the "power supply" was appropriate for the "loads" it was supporting (e.g., using the power supply for one version of the product with the "loads" for another version -- that, presumably, required a *different* version of the power supply!).

And, I know many folks here use FOSS -- which makes VC and CM much more

*likely* (because you have far more ability to "change" those tools). [E.g., I suspect the effort put into controlling *binaries* boils down to "archiving" CD's of older tools in a desk drawer -- and digging through the stack when/if you ever need to return to an older version of that tool. I.e., UNLIKELY to be able to just connect to a local repository and "check-out" a tool of the appropriate vintage (along with associated configuration details) on-the-fly. "Click here to recreate the environment in which version X of product Y was released -- despite the fact that product Y is now at version BB and the tools used to create X have individually evolved through many successive revisions"]

What I had *hoped* is some FOSS users would be more "disciplined" in their approach to their toolchains instead of just "FOSS *users*" (and had availed themselves of modern tools to facilitate this). :<

From the correspondence that I've received, my approach seems as good as any other:

Some arguments suggest I move ALL of my localization changes into a single repository -- and consider that *purely* "mine" (with no intent on ever sharing its contents).

Others suggest the opposite extreme -- pulling ALL of my changes (even bug fixes to the "common" sources) *out* of the "main" repository so the repository can readily be updated to reflect the "authoritative" ones (without risk of *my* changes being lost in the update).

Either approach will work -- it just boils down to how I want to adjust my "process"...

- C
- Clifford Heath
  
  Contact options for registered users
Vote on answer
posted
9 years ago

Thu, Jan 1, 2015 6:26 AM

You seem to be talking to me, but not responding to my comments at all.

All the "devops" folk I know use FOSS, and all use (and contribute to extending) very sophisticated automated CM systems for provisioning their cloud servers. The goal isn't always to provide identical VMs - that can be done just by cloning - but to build the VM with the required

*semantic* versions of all required components and configurations.

What's a CD? :) Why would you put data into a drawer? :P

You don't CM binaries. If you need identical binaries, you save the entire VM. Storage is cheap, time is not. Even 12 years ago we used to archive the entire build machine (VM) for every product release.

I think you're wrong to focus on archiving the changes. You should CM and archive the recipes which built the entire machine, and never alter a built VM (except to test a new recipe, from which you then build a new clean VM).

However, I don't think you'll get advice on how to do this from many folk here. Devops folk frequent different forums.

Clifford Heath

- D
- Don Y
  
  Contact options for registered users
Vote on answer
posted
9 years ago

Thu, Jan 1, 2015 8:34 AM

I was trying to explain the rationale for asking here. *My* usage is similar to theirs (SED/CAE) -- not those of server farms. There is a big difference in *scale* as well as in the frequency of configuration changes. And, the "process" involved.

I.e., I don't need an "order from above" to enable a service *now*... then disable it an hour from now because it was the wrong thing to do. Or, perhaps it was the right thing to do but not worth *recording*.

*I* also am ultimately responsible for deciding what can/will be shared and what won't -- not "Legal" or "Corporate" with lots of formal policies.

E.g., if someone is interested in how I've configured one of my window managers, I can check-out the latest version of its configuration file for it and email it to them. Or not. If a client needs a copy of some document that I delivered previously (I am amazed at how often this happens: "Can't you guys keep track of these things??"), I can drag it out of the archive as if it had been sitting on my disk/desk all along. I only need to know what version OF THAT DOCUMENT is required.

This is much heavier-weight than the issues I face. I want to be able to retrieve any "item" I've created (sources, illustrations, sound samples, firmware images, tool binaries, etc.) at any "version level" that is appropriate for the activity that I currently have to perform.

[That activity may include "make world"!]

Similarly, I want to be able to configure those tools and the environment in which they operate in a manner appropriate for that "application".

Yet, do this in an environment where others (unrelated to me) are also working on *some* of the same "items" at the same time. I want to be able to benefit from their actions as well as share the results of my own "appropriate" actions -- without also sharing things that are not worth sharing (or legally possible to share!).

E.g., my changes to portions of /etc would rarely be relevant to any other site. Backing them *out* of the portion of the repository that is "shared" would be problematic. So, keeping them in a separate repository allows me to track them as well as keeping the "regular" part of the repository available to be updated AND shared with those other parties.

Tonight I found a bug in one of the NIC drivers. The changes I make to the kernel sources will eventually migrate back to the NetBSD folks. So, I'll commit the changes to those sources *in* my local NetBSD repo (from whence the original sources used in my current kernel came).

But, I won't want (e.g.) the mail "aliases" that I've created to ever leave here (they wouldn't be of interest to any other user) -- or the configuration for my current window manager -- unless I explicitly went looking to export them.

*Which* build environment? If I am using several different versions of several different tools, do I save each possible combination of build environments? Or, do I check-out the compiler used for this version of this module, and the schematic capture package used for the version of the board on which it executes? Likewise, the fonts used in the documentation -- along with the tools used to prepare the documentation?

There are simply way too many combinations involved. The alternative is to build a simple machine for each individual tool. Then, you're faced with running one VM for the schematic capture tool (version X), another for the PCB layout tool (version M), another for the compiler, documentation prep tools, etc.

I've found it much easier to just have a generic machine (OS) onto which I can load checked-out binaries. Whichever tools I need and whichever versions. So, I can prepare documentation using a tool running on DOS (!) that pertains to a PCB layout done under W98, for code that I am now maintaining under NetBSD, etc.

Just like I can freely check-out version 1 of a User Manual and version

23 of the Reference Manual for the same product and have them coreside on the same machine in the same instant in time ("now") even though they may never have been "contemporaries" of each other. [The alternative is to redo the docs using a more modern tool, redo the PCB layout using a newer tool, etc. I'm not going to redo all of that work "on my dime"! Nor would I want to redo it even if paid to do so!]

This would be impractical for the reasons above. I'd be going through TB drives every week or two (because you're suggesting imaging the entire machine -- not just individual binaries!)

Presently, I recreate a particular snapshot by simply checking out everything tagged with a specific identifier associated with that snapshot. Or, just some portion of the repository with a different identifier (e.g., if I want the version of gcc present in the NEW_YEARS_2015 or PROJECT_FOO snapshot, then I can just ask for that -- binary or source).

[Of course, I have to impose some knowledge and discipline on what gets grouped together -- can't mix and match incompatible tools arbitrarily]

Again, I think they operate on a different scale and in a different environment. I ran this past a friend who manages a large server farm and he "couldn't relate" to the different ways I wanted to be able to "cut up" what I archived/restored. Why *this* would want to be shared while *that* wouldn't. In his world, the changes tend to be very linear/predictable. And, you just add whatever hardware and burn however many hours that it takes (it's not HIS money! :> )

I'd more imagine folks here to have to deal with questions like: "We got a call from Bob. He wants a change made to the FOO project hardware in order to support this particular software feature..." Unless you're fortunate (cursed?) enough to live with the same version of tools for all your projects, this has got to mean rewinding the WABAC machine to the point in time appropriate for Bob's FOO project and recovering the tools that allow you to *see* what you had done... THEN, recovering the tools that will allow you to *do* what you must now do!

E.g., I've been asked to recreate SMT versions of older thru-hole designs. Do I stick with the old, original tool? Or, avail myself of a newer release? Yet, I may need the original tool to access the design in a "portable" manner -- that I may not have anticipated needing when I archived the project (e.g., perhaps a netlist in some other format that the new tool supports but that I didn't NEED when originally using the old tool)