Discussion: Moving to git
Hi, I would like to move the source code from CVS to GIT. This will enable people to branch out bugfixes and new ideas. I asked for the git-repo on git.postgresql.org. But we could start with a repo on github.
On 10/02/2011 07:56 PM, Kjetil wrote:
> Hi,
>
> I would like to move the source code from CVS to GIT. This will enable
> people to branch out bugfixes and new ideas.
>
> I asked for the git-repo on git.postgresql.org. But we could start
> with a repo on github.

Yay!

Even if you use git.postgresql.org, I'd love to see a GitHub mirror. Their nice UI, push-pull functionality etc. is great for keeping track of patches and for encouraging people to submit smaller changes.

--
Craig Ringer
May I ask who you are ? While moving to git has been in the back of my mind the work involved to maintain the history from CVS is not trivial. I do find it a little presumptuous that your very first post to the list is to move the source code ?? Dave Cramer dave.cramer(at)credativ(dot)ca http://www.credativ.ca On Sun, Oct 2, 2011 at 7:56 AM, Kjetil <polpot78@gmail.com> wrote: > Hi, > > I would like to move the source code from CVS to GIT. This will enable > people to branch out bugfixes and new ideas. > > I asked for the git-repo on git.postgresql.org. But we could start > with at repo on github. > > > -- > Sent via pgsql-jdbc mailing list (pgsql-jdbc@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-jdbc >
For what it's worth, I'll second this, although Dave's right--the post is a little presumptuous. Having worked on a number of svn-to-git transitions, the mechanics of preserving history are actually pretty trivial (git cvsimport / git svn clone make this dead easy), but doing the right thing with commit authorship, keyword expansion, empty folders, etc. is much trickier. And if you *don't* do that right off the bat, the entire "lineage" changes when you correct it later, making merges a nightmare (if you do them naively in this situation, you end up with two alternate timelines in the same repo). The transition needs to be done right (rather than running `git cvsimport` and throwing the result up on github). --- Maciek Sakrejda | System Architect | Truviso 1065 E. Hillsdale Blvd., Suite 215 Foster City, CA 94404 (650) 242-3500 Main www.truviso.com
FYI, there is a great system to do access control called gitolite. It is used on kernel.org. It could help keep things sane until the transition is very sane :) Marc-André Laverdière Software Security Scientist Innovation Labs, Tata Consultancy Services Hyderabad, India On 10/03/2011 08:02 AM, Maciek Sakrejda wrote: > For what it's worth, I'll second this, although Dave's right--the post > is a little presumptuous. Having worked on a number of svn-to-git > transitions, the mechanics of preserving history are actually pretty > trivial (git cvsimport / git svn clone make this dead easy), but doing > the right thing with commit authorship, keyword expansion, empty > folders, etc. is much trickier. And if you *don't* do that right off > the bat, the entire "lineage" changes when you correct it later, > making merges a nightmare (if you do them naively in this situation, > you end up with two alternate timelines in the same repo). The > transition needs to be done right (rather than running `git cvsimport` > and throwing the result up on github). > --- > Maciek Sakrejda | System Architect | Truviso > > 1065 E. Hillsdale Blvd., Suite 215 > Foster City, CA 94404 > (650) 242-3500 Main > www.truviso.com >
Dave Cramer <pg 'at' fastcrypt.com> writes: > While moving to git has been in the back of my mind the work > involved to maintain the history from CVS is not trivial. Can you elaborate a little bit? Maybe some others could help. Personally I used git-cvsimport to move a few CVS repos of mine to git and I was surprised how smooth it was. And it of course automatically rebuilds multi files commits out of CVS history automatically. Yet, I never used CVS as a pro and didn't branch[1] so I may have had simpler situations than JDBC source. Thanks Ref: [1] but according to Linus, no one branches when not using git right? :) -- Guillaume Cottenceau
> Dave Cramer <pg 'at' fastcrypt.com> writes:
>
>> While moving to git has been in the back of my mind the work
>> involved to maintain the history from CVS is not trivial.
>
> Can you elaborate a little bit? Maybe some others could help.
>
> Personally I used git-cvsimport to move a few CVS repos of mine
> to git and I was surprised how smooth it was. And it of course
> automatically rebuilds multi files commits out of CVS history
> automatically. Yet, I never used CVS as a pro and didn't
> branch[1] so I may have had simpler situations than JDBC source.

Having watched the traffic on the Hackers mailing list during their migration from CVS to Git, it seemed a difficult and lengthy process which included one aborted attempt. From memory, the difficulty they faced was preserving history correctly because of the introduction of "ghost" commits by the tool. Some threads of interest include:

http://archives.postgresql.org/pgsql-hackers/2010-07/msg00217.php
http://archives.postgresql.org/pgsql-hackers/2010-08/msg01117.php
http://archives.postgresql.org/pgsql-hackers/2010-08/msg01077.php
http://archives.postgresql.org/pgsql-hackers/2010-08/msg01247.php
http://archives.postgresql.org/pgsql-hackers/2010-08/msg01263.php

Regards,

--
Mike Fowler
Registered Linux user: 379787
On Mon, Oct 3, 2011 at 5:01 AM, Mike Fowler <mike@mlfowler.com> wrote: > >> Dave Cramer <pg 'at' fastcrypt.com> writes: >> >>> While moving to git has been in the back of my mind the work >>> involved to maintain the history from CVS is not trivial. >> >> Can you elaborate a little bit? Maybe some others could help. >> >> Personally I used git-cvsimport to move a few CVS repos of mine >> to git and I was surprised how smooth it was. And it of course >> automatically rebuilds multi files commits out of CVS history >> automatically. Yet, I never used CVS as a pro and didn't >> branch[1] so I may have had simpler situations than JDBC source. >> > > Having watched the traffic on the Hackers mailing list during their > migration to Git from CVS it seemed a difficult and lengthy process which > included one aborted attemt. From memory the difficulties faced was > preserving history correctly becuase of the introduction of "ghost" > commits by the tool. Some threads of interest include: > > http://archives.postgresql.org/pgsql-hackers/2010-07/msg00217.php > http://archives.postgresql.org/pgsql-hackers/2010-08/msg01117.php > http://archives.postgresql.org/pgsql-hackers/2010-08/msg01077.php > http://archives.postgresql.org/pgsql-hackers/2010-08/msg01247.php > http://archives.postgresql.org/pgsql-hackers/2010-08/msg01263.php > > Regards, > > -- > Mike Fowler > Registered Linux user: 379787 Well before anything happens we need to get buy in from Kris Jurka who has been shouldering most of the responsibility for the driver for the past few years. As I said it was presumptuous to just jump on the list and say "I want to move the source to git" without introducing oneself, or communicating with anyone here first. My comment about "thinking about it" was just that, I have contemplated it. I also have not talked to Kris or Oliver about moving to git. So before we get too far down the road I think it's important to have an open discussion about whether we want to move to git. 
Dave Cramer dave.cramer(at)credativ(dot)ca http://www.credativ.ca
> My comment about "thinking about it" was just that, I have
> contemplated it. I also have not talked to Kris or Oliver about
> moving to git. So before we get too far down the road I think it's
> important to have an open discussion about whether we want to move to
> git.
I made one attempt to convince Kris to allow migrating to Git: http://markmail.org/message/diaht5koqznqcmlm
On Fri, 1 Apr 2011, Valentine Gogichashvili wrote:
> Kris, are there any plans to port the project to use git?
Yes, eventually. CVS is certainly ancient and with the server having moved to git, I see no reason for us not to follow. At the same time I don't really feel any sense of urgency because CVS is still functional. I'd say it would be a post 9.1 item to work on. If you followed the amount of work it took the server team to get the cvs->git conversion the way they wanted, it may not be as trivial as you'd hope.
Kris Jurka
So 9.1 is out, and one could think about starting the migration project actually :)
I also still think that moving to git would be nice, but apparently the fact that one can, in theory, work with diffs is enough to block the whole idea of migration as soon as there is something more important to do, which will always be true :)
So maybe it makes sense just to start a migration project: prepare some scripts to do it, discuss the pros and cons, and keep a read-only git repository that fetches all CVS commits and pushes them to git for a while. Then we can see whether it is what we want, and once we have a working setup, just start using it at some point?
-- Valentine Gogichashvili
On 03/10/11 19:04, Dave Cramer wrote: > On Mon, Oct 3, 2011 at 5:01 AM, Mike Fowler <mike@mlfowler.com> wrote: >> >>> Dave Cramer <pg 'at' fastcrypt.com> writes: >>> >>>> While moving to git has been in the back of my mind the work >>>> involved to maintain the history from CVS is not trivial. >>> >>> Can you elaborate a little bit? Maybe some others could help. >>> >>> Personally I used git-cvsimport to move a few CVS repos of mine >>> to git and I was surprised how smooth it was. And it of course >>> automatically rebuilds multi files commits out of CVS history >>> automatically. Yet, I never used CVS as a pro and didn't >>> branch[1] so I may have had simpler situations than JDBC source. >>> >> >> Having watched the traffic on the Hackers mailing list during their >> migration to Git from CVS it seemed a difficult and lengthy process which >> included one aborted attemt. From memory the difficulties faced was >> preserving history correctly becuase of the introduction of "ghost" >> commits by the tool. Some threads of interest include: >> >> http://archives.postgresql.org/pgsql-hackers/2010-07/msg00217.php >> http://archives.postgresql.org/pgsql-hackers/2010-08/msg01117.php >> http://archives.postgresql.org/pgsql-hackers/2010-08/msg01077.php >> http://archives.postgresql.org/pgsql-hackers/2010-08/msg01247.php >> http://archives.postgresql.org/pgsql-hackers/2010-08/msg01263.php >> >> Regards, >> >> -- >> Mike Fowler >> Registered Linux user: 379787 > > Well before anything happens we need to get buy in from Kris Jurka who > has been shouldering most of the responsibility for the driver for the > past few years. > > As I said it was presumptuous to just jump on the list and say "I want > to move the source to git" without introducing oneself, or > communicating with anyone here first. Yeah - sorry I jumped in so enthusiastically on that. I hadn't taken a good look at the From: field, and thought it was one of the main JDBC folks. 
FWIW, I'd love to see a move to git both for consistency with the main project and for the much, MUCH better facilities for external contribution. To what extent is *perfect* history reproduction required for PgJDBC? -- Craig Ringer
> To what extent is *perfect* history reproduction required for PgJDBC? Well, it's not about "perfect"--CVS and git are different systems and having the exact same semantics in both is not feasible (e.g., CVS's lack of atomic commits). But in general, this *is* a very good question: what would be the criteria for a transition? E.g., how would we handle something like the keyword expansion discussion that Mike linked? Because ancestry dictates commit hashes, fixing something like this after the fact would be a nightmare. Also, should this be the new go-to repo for *all* history, or is there any reason to keep CVS around for "archival" versions? I believe a git transition should support all history just fine and the old CVS repo could be mothballed (or set up to mirror git), but I could be missing something. Also, how would we validate the transition? E.g., something like passing the (tagged) test suite for each tag (on all supported-at-the-time Java versions?) could be a first step (assuming the suite from CVS passes, but I would hope that's a safe assumption), and probably some source-level diffs against corresponding CVS checkouts. That's easy enough to script and should give us a fair amount of confidence in the move. --- Maciek Sakrejda | System Architect | Truviso 1065 E. Hillsdale Blvd., Suite 215 Foster City, CA 94404 (650) 242-3500 Main www.truviso.com
On Tue, 4 Oct 2011, Craig Ringer wrote: > To what extent is *perfect* history reproduction required for PgJDBC? > Perfect history is not a requirement. Largely because of the massive rewrite in the 8.0 timeframe there's not much utility in looking back much further. Still, imperfect history better have a good explanation for its existence. The server project had several suspect commits that they reconciled with their saved source trees rather than discarding. I don't think the JDBC driver has nearly as many suspect commits, but I don't see why it would be hard to accurately import our history. My only caveat is for those who think git-cvsimport is adequate: http://archives.postgresql.org/pgsql-www/2008-12/msg00124.php Kris Jurka
So how does moving to git revolutionize the process ? I do understand that CVS is ancient but what changes if we move to git ? Dave Cramer dave.cramer(at)credativ(dot)ca http://www.credativ.ca On Tue, Oct 4, 2011 at 3:03 AM, Kris Jurka <books@ejurka.com> wrote: > > > On Tue, 4 Oct 2011, Craig Ringer wrote: > >> To what extent is *perfect* history reproduction required for PgJDBC? >> > > Perfect history is not a requirement. Largely because of the massive > rewrite in the 8.0 timeframe there's not much utility in looking back much > further. Still, imperfect history better have a good explanation for its > existence. The server project had several suspect commits that they > reconciled with their saved source trees rather than discarding. I don't > think the JDBC driver has nearly as many suspect commits, but I don't see > why it would be hard to accurately import our history. > > My only caveat is for those who think git-cvsimport is adequate: > > http://archives.postgresql.org/pgsql-www/2008-12/msg00124.php > > Kris Jurka > > -- > Sent via pgsql-jdbc mailing list (pgsql-jdbc@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-jdbc >
On 10/04/2011 04:35 PM, Dave Cramer wrote:
> So how does moving to git revolutionize the process ? I do understand
> that CVS is ancient but what changes if we move to git ?

The main wins for committers are:

- Local branches for work-in-progress code. Commit early, commit often. When you are ready to publish your changes to the world, you have options to clean up the history of your work so others can understand it better (and to remove those "oops" commits) with tools like squash merges and my favourite, "git rebase --interactive". I've found these tools incredibly helpful even in my private development.
- A local copy of history and the current state of the public repo. It's easy to diff and revert without network access or delay.
- Speed. Everything is fast. This really does make a difference, speaking as someone who's spent lots of time working with both cvs and svn as well.
- Easier acceptance of changes from others via git format-patch, push/pull and/or GitHub merge requests.
- Merges are so much less painful than with CVS.
- Some really convenient little goodies like "git stash".

For other contributors:

- It's way, way easier to follow development.
- Personal feature branches are possible, practical, and easy. This makes developing bigger changes way easier and makes it practical to follow HEAD while doing so.
- Speed.
- It's trivial to produce a logical, sane and easily applied patch set using "git rebase --interactive" and "git format-patch".

--
Craig Ringer
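As a rough illustration of the contributor workflow described above (a private branch, a messy intermediate commit, a squash merge, then "git format-patch"), here is a minimal sketch. The repository path, branch names, file name and committer identity are all made up for the example; none of them reflect the actual pgjdbc layout.

```shell
#!/bin/sh
# demo_patch_workflow REPO_DIR
# Builds a throwaway repository in REPO_DIR and walks through:
#   1. a private "bugfix" branch with an "oops" commit,
#   2. a squash merge that collapses it into one clean commit,
#   3. "git format-patch" to produce a patch file that can be mailed.
demo_patch_workflow() {
    repo=$1
    git init -q "$repo" || return 1
    (
        set -e
        cd "$repo"
        # Use a fixed branch name so the sketch does not depend on git's default.
        git symbolic-ref HEAD refs/heads/trunk
        # Hypothetical identity, set only inside this throwaway repository.
        git config user.name "Example Contributor"
        git config user.email "contributor@example.invalid"

        echo "version 1" > Driver.txt
        git add Driver.txt
        git commit -q -m "Initial import"

        # Work in progress happens on a local branch.
        git checkout -q -b bugfix
        echo "version 2 with a typo" > Driver.txt
        git commit -q -a -m "Fix the bug"
        echo "version 2" > Driver.txt
        git commit -q -a -m "Oops, fix typo"

        # Publish one tidy commit instead of the two messy ones.
        git checkout -q trunk
        git merge -q --squash bugfix > /dev/null
        git commit -q -m "Fix the bug"

        # One patch file for the last commit, written to ./patches.
        git format-patch -q -1 -o patches
    )
}
```

Running `demo_patch_workflow /tmp/demo-repo` should leave a single patch file under `/tmp/demo-repo/patches`. An interactive rebase would reach the same clean history; the squash merge is used here only because it needs no editor.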
>My proposal is this: > 1. Do a git cvsimport. > 2. Upload it to git.postgresql / github. I'm a huge git fan, but I vehemently disagree. Fixing history after the fact is an absolute nightmare. Dealing with broken history forever is not fun either. *If* (ok, more realistically, *when*) we migrate, let's do it once and do it right. If you really can't wait for it, it's easy enough to do a git cvsimport yourself and put it up on your own github account (then you get to deal with the problems of different lineages when the project migrates officially). --- Maciek Sakrejda | System Architect | Truviso 1065 E. Hillsdale Blvd., Suite 215 Foster City, CA 94404 (650) 242-3500 Main www.truviso.com
Hi,

Sorry for being a bit in a rush. My name is Kjetil Nygård, and I'm Norwegian. My strongest and weakest attribute is my impatience.

I have seen the discussion about moving to git.postgresql.org and/or github, but as far as I can see it stalled there. That's why I dared to shoot from the hip.

My proposal is this:
1. Do a git cvsimport.
2. Upload it to git.postgresql / github.

When it comes to the concrete elements you mention, I would go for the simple solution: just do an import and upload it, then fix problems as they occur.

Authorship is preserved with git-cvsimport. The authors could maybe be rewritten to username@somewhere.com, but I would not say that is very important. I guess that people will understand who jurka is anyway.

I don't have any experience with keyword expansion. (We just ignored it when we migrated from cvs to svn / git.)

Empty folders should not be a problem. They can and should be created by the ant script anyway.

So my proposal is to follow Nike and "Just do it!"

PS: It's just a proposal.

-Kny :-)

On Sun, 2011-10-02 at 19:32 -0700, Maciek Sakrejda wrote:
> For what it's worth, I'll second this, although Dave's right--the post
> is a little presumptuous. Having worked on a number of svn-to-git
> transitions, the mechanics of preserving history are actually pretty
> trivial (git cvsimport / git svn clone make this dead easy), but doing
> the right thing with commit authorship, keyword expansion, empty
> folders, etc. is much trickier. And if you *don't* do that right off
> the bat, the entire "lineage" changes when you correct it later,
> making merges a nightmare (if you do them naively in this situation,
> you end up with two alternate timelines in the same repo). The
> transition needs to be done right (rather than running `git cvsimport`
> and throwing the result up on github).
Dear jdbc-team:

1. cvsimport keeps the history. It will not be lost.
2. But if there are changes that we want to make to the history, then we have to make them as we convert to git. Can you please describe which changes you see as necessary?

-Kny

On Tue, 2011-10-04 at 09:13 -0700, Maciek Sakrejda wrote:
> >My proposal is this:
> > 1. Do a git cvsimport.
> > 2. Upload it to git.postgresql / github.
>
> I'm a huge git fan, but I vehemently disagree. Fixing history after
> the fact is an absolute nightmare. Dealing with broken history forever
> is not fun either. *If* (ok, more realistically, *when*) we migrate,
> let's do it once and do it right. If you really can't wait for it,
> it's easy enough to do a git cvsimport yourself and put it up on your
> own github account (then you get to deal with the problems of
> different lineages when the project migrates officially).
On Tue, 4 Oct 2011, Kjetil Nygård wrote:
> 1. cvsimport keeps the history. It will not be lost.

My experience is that git-cvsimport does not work correctly. Or at least it didn't without a patched version of cvsps. Perhaps that's all been fixed up, but that's why it's important to do a verification that history has been accurately preserved rather than just publishing whatever results come out.

http://archives.postgresql.org/pgsql-www/2008-12/msg00124.php

If anyone would like to work on the cvs to git conversion what I would like to see is a script/configuration file to invoke the conversion utility and then a script to do a comparison between any tag in cvs and the corresponding tag in the new git repository.

Kris Jurka
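The tag-by-tag comparison Kris asks for could be sketched roughly as follows. This is only an illustration under assumptions, not the script that was later written for the project: it assumes each tag has already been checked out into two hypothetical directory trees, `CVS_BASE/<tag>` and `GIT_BASE/<tag>` (for example with `cvs export -r <tag>` and `git archive <tag> | tar -x`), and it only compares file contents.

```shell
#!/bin/sh
# compare_trees DIR_A DIR_B
# Succeeds only if both trees have identical contents, ignoring the
# metadata directories of either tool.
compare_trees() {
    diff -r --exclude=CVS --exclude=.git "$1" "$2" > /dev/null 2>&1
}

# verify_tags CVS_BASE GIT_BASE
# Reads tag names from stdin, one per line, and compares
# CVS_BASE/<tag> with GIT_BASE/<tag> (a hypothetical layout).
# Returns non-zero if any tag differs.
verify_tags() {
    cvs_base=$1
    git_base=$2
    failed=0
    while IFS= read -r tag; do
        if compare_trees "$cvs_base/$tag" "$git_base/$tag"; then
            echo "OK       $tag"
        else
            echo "MISMATCH $tag"
            failed=1
        fi
    done
    return "$failed"
}
```

A call such as `printf 'TAG_ONE\nTAG_TWO\n' | verify_tags cvs-co git-co` prints one line per tag. Because the comparison is strict, files whose only difference is an expanded CVS keyword would be reported as mismatches unless such lines are filtered first.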
On Tue, Oct 4, 2011 at 6:39 PM, Kris Jurka <books@ejurka.com> wrote:
> On Tue, 4 Oct 2011, Kjetil Nygård wrote:
>> 1. cvsimport keeps the history. It will not be lost.
>
> My experience is that git-cvsimport does not work correctly. Or at least
> it didn't without a patched version of cvsps. Perhaps that's all been
> fixed up, but that's why it's important to do a verification that history
> has been accurately preserved rather than just publishing whatever results
> come out.
>
> http://archives.postgresql.org/pgsql-www/2008-12/msg00124.php
>
> If anyone would like to work on the cvs to git conversion what I would
> like to see is a script/configuration file to invoke the conversion
> utility and then a script to do a comparison between any tag in cvs and
> the corresponding tag in the new git repository.

I have also had bad experiences with cvsimport, such as getting the first commit wrong and problems with tags/branches.

Then I re-imported with cvs2git, which worked perfectly; no messing around was needed.

So I suggest you skip cvsimport and do the import with cvs2git.

--
marko
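For readers who have not used cvs2git, a conversion along the lines Marko suggests might look like the sketch below. It is an assumption-laden outline rather than the procedure anyone in this thread ran: the function name, the working directory layout and the output repository name `pgjdbc.git` are invented for the example, and it presumes a local copy of the CVS repository (the directory containing the `,v` files).

```shell
#!/bin/sh
# convert_with_cvs2git CVS_REPO WORK_DIR
# CVS_REPO: local copy of the CVS repository (hypothetical path).
# WORK_DIR: receives the intermediate dump files and a new bare git repo.
convert_with_cvs2git() {
    cvs_repo=$1
    work_dir=$2

    [ -d "$cvs_repo" ] || { echo "no such CVS repository: $cvs_repo" >&2; return 2; }
    command -v cvs2git > /dev/null 2>&1 || { echo "cvs2git not found" >&2; return 3; }
    command -v git > /dev/null 2>&1 || { echo "git not found" >&2; return 3; }

    mkdir -p "$work_dir" || return 1

    # cvs2git writes two files in git fast-import format:
    # one with file contents (blobs), one with the commit history.
    cvs2git --blobfile="$work_dir/git-blob.dat" \
            --dumpfile="$work_dir/git-dump.dat" \
            --username=cvs2git \
            "$cvs_repo" || return 1

    # Load both files into a fresh bare repository.
    git init -q --bare "$work_dir/pgjdbc.git" || return 1
    cat "$work_dir/git-blob.dat" "$work_dir/git-dump.dat" \
        | git --git-dir="$work_dir/pgjdbc.git" fast-import --quiet
}
```

Because the result lands in a private directory, the conversion can be repeated as often as needed before anything is published.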
So it would appear that there are some motivated people here with more experience than I. Is it feasible to take the source tree as it is and migrate it to github, or a personal git repo at git.postgresql.org, then clone it to the pgjdbc git repo once the migration has been verified ?

Dave Cramer
dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

2011/10/4 Marko Kreen <markokr@gmail.com>:
> On Tue, Oct 4, 2011 at 6:39 PM, Kris Jurka <books@ejurka.com> wrote:
>> On Tue, 4 Oct 2011, Kjetil Nygård wrote:
>>> 1. cvsimport keeps the history. It will not be lost.
>>
>> My experience is that git-cvsimport does not work correctly. Or at least
>> it didn't without a patched version of cvsps. Perhaps that's all been
>> fixed up, but that's why it's important to do a verification that history
>> has been accurately preserved rather than just publishing whatever results
>> come out.
>>
>> http://archives.postgresql.org/pgsql-www/2008-12/msg00124.php
>>
>> If anyone would like to work on the cvs to git conversion what I would
>> like to see is a script/configuration file to invoke the conversion
>> utility and then a script to do a comparison between any tag in cvs and
>> the corresponding tag in the new git repository.
>
> I also have bad experience with cvsimport - like getting
> first commit wrong, problems with tags/branches.
>
> Then I re-imported with cvs2git which worked perfectly,
> no messing around was needed.
>
> So I suggest you skip cvsimport and do the import with cvs2git.
>
> --
> marko
On Tue, Oct 4, 2011 at 5:42 PM, Dave Cramer <pg@fastcrypt.com> wrote: > So it would appear that there are some motivated people here with more > experience than I. Is it feasible to take the source tree as it is an > migrate it to github, or a personal git repo at git.postgresql.org > then clone it to the pgjdbc git repo once the migration has been > verified ? Yes, absolutely. The repo doesn't even have to be public initially (i.e., you just do the export by creating a new repo on your workstation and only share it when you're happy with it). The nice part about this is that we can attempt migration several times (if necessary) without having to babysit it, and abandoning a failed migration doesn't impact the current CVS repo. The only issue is that a full export makes the CVS server sweat a little bit. In fact, the process probably should be such that someone comes up with a converted repo and semi-automated verification steps. This can be inspected and verified by anyone interested, and once there's consensus, we can start talking about blessing that particular export as the new official one, and pushing that to the official repo. The CVS repo is fairly low-volume, so manually re-applying any intervening patches should not be a problem. --- Maciek Sakrejda | System Architect | Truviso 1065 E. Hillsdale Blvd., Suite 215 Foster City, CA 94404 (650) 242-3500 Main www.truviso.com
On 10/5/2011 9:34 AM, Kjetil Nygård wrote:
> Is it possible to put a snapshot copy of the cvs-directory somewhere, so
> people who are interested can download the cvs-repo and set it up locally
> for test migrations?

http://ejurka.com/pgsql/tmp/pgjdbc-cvs.tar.gz

Kris Jurka
On Tue, 2011-10-04 at 22:24 -0700, Maciek Sakrejda wrote:
> Yes, absolutely. The repo doesn't even have to be public initially
> (i.e., you just do the export by creating a new repo on your
> workstation and only share it when you're happy with it). The nice
> part about this is that we can attempt migration several times (if
> necessary) without having to babysit it, and abandoning a failed
> migration doesn't impact the current CVS repo. The only issue is that
> a full export makes the CVS server sweat a little bit.
>
> In fact, the process probably should be such that someone comes up
> with a converted repo and semi-automated verification steps. This can
> be inspected and verified by anyone interested, and once there's
> consensus, we can start talking about blessing that particular export
> as the new official one, and pushing that to the official repo. The
> CVS repo is fairly low-volume, so manually re-applying any intervening
> patches should not be a problem.

Is it possible to put a snapshot copy of the cvs-directory somewhere, so that people who are interested can download the cvs-repo and set it up locally for test migrations?

Regards,
Kny
On Wed, 2011-10-05 at 00:04 +0200, Marko Kreen wrote:
> I also have bad experience with cvsimport - like getting
> first commit wrong, problems with tags/branches.
>
> Then I re-imported with cvs2git which worked perfectly,
> no messing around was needed.
>
> So I suggest you skip cvsimport and do the import with cvs2git.

Would it be an idea to try cvs -> svn first, and then use git-svn to import it? git svn seems stable, judging by my use of it at work :-)

Regards,
Kny
Verified that this works, and that the (filtered) diffs are clean for all tags. I've incorporated the script changes and pushed them out to the github repository I sent out earlier. I don't quite get the filtering expression here--can someone clarify?

    diff -r --exclude=CVS --exclude=.git . ../cvs-co/${tag} \
        | grep "\$Header:" -v \
        | egrep -v "^\-\-\-$" \
        | egrep -v "^[0-9]+c[0-9]+" \
        | grep -v "^diff \-r '\-\-exclude"

Is this just filtering out the '$Header' in the Makefile from back in the day? Is there a better way to deal with this? It seems to get expanded to include my username and path to export.

I'm fairly certain that with a judicious application of git filter-branch, I can strip out all those "manufactured commits" that Tom mentioned. The issues Kris mentioned [1] seem to be more complex, but I don't quite understand what they are. Can someone point out a specific bit of history that's mangled in Marko's github repo? I *think* it might be something like the problems discussed starting here [2] for the server migration, but it'd be helpful to see this going wrong in the jdbc history.

Also, from reading the mailing list archives, it looks like tag-by-tag verification may *not* be a good way to verify sanity of full history. Any suggestions as to what a more extensive automated verification method might entail? We'd want to do manual spot-checks as well, of course, but is there something more we can do for an automated check?

[1]: http://archives.postgresql.org/pgsql-www/2008-12/msg00124.php
[2]: http://archives.postgresql.org/pgsql-hackers/2010-08/msg01274.php

---
Maciek Sakrejda | System Architect | Truviso

1065 E. Hillsdale Blvd., Suite 215
Foster City, CA 94404
(650) 242-3500 Main
www.truviso.com
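One way to read the filtering expression is stage by stage. The sketch below is a cleaned-up equivalent written for illustration, not the script from the repository; the function name and the sample diff output (file names, tag name, paths, user name) are all made up. It assumes a diff that echoes its own command line with quoted options before each file, as recent GNU diff does.

```shell
#!/bin/sh
# filter_keyword_noise: reads "diff -r" output on stdin and drops
#   1. lines containing the CVS keyword "$Header:",
#   2. the "---" separator between the two sides of a change,
#   3. single-line change markers such as "3c3",
#   4. the per-file "diff -r '--exclude=...' ..." command lines.
# Anything that survives is a difference the filter did not expect.
filter_keyword_noise() {
    grep -v -F '$Header:' \
        | grep -v -x -e '---' \
        | grep -E -v '^[0-9]+c[0-9]+' \
        | grep -v "^diff -r '--exclude"
}

# Made-up sample: the first file differs only in its expanded $Header
# keyword, the second file has a genuine one-line difference.
filter_keyword_noise <<'EOF'
diff -r '--exclude=CVS' '--exclude=.git' ./Makefile ../cvs-co/SOME_TAG/Makefile
3c3
< # $Header: /cvsroot/example/Makefile,v 1.2 2004/01/01 someone Exp $
---
> # $Header: /home/me/export/Makefile,v 1.2 2004/01/01 someone Exp $
diff -r '--exclude=CVS' '--exclude=.git' ./build.xml ../cvs-co/SOME_TAG/build.xml
10c10
< <property name="level" value="1"/>
---
> <property name="level" value="2"/>
EOF
```

For this sample only the two `<property ...>` lines survive, so an empty result means the trees differ at most in `$Header` lines. The filter is blunt: stages 2 and 3 drop separators and single-line markers for every file, so a real change is visible only through its `<` and `>` lines, and a real change on a line that also contains `$Header:` would be hidden.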
On Wed, 2011-10-05 at 09:39 -0700, Kris Jurka wrote:
> On 10/5/2011 9:34 AM, Kjetil Nygård wrote:
> > Is it possible to put a snapshot copy of the cvs-directory somewhere, so
> > people who are interested can download the cvs-repo and set it up locally
> > for test migrations?
>
> http://ejurka.com/pgsql/tmp/pgjdbc-cvs.tar.gz

Hi,

My original e-mail did not come through, because it included an image.

I have now made two scripts. The first takes the dump above and converts it to git with svn2git. The second verifies that the contents of the tags are the same.

It is also a good idea to browse the non-bare git repository with qgit to see if there is something wrong with the commit tree. (See screenshot)

To run the scripts:

    # Convert
    $ mkdir <an empty dir>
    $ cd <an empty dir>
    $ path/to/convert_cvs_git.sh ../pgjdbc-cvs.tar.gz

    # Verify
    $ export CVSROOT=<equal copy of the tar.gz>
    $ path/to/verify-tags-equal.sh

Regards,
Kny
Attachments
On Sat, 2011-10-08 at 01:15 -0700, Maciek Sakrejda wrote:
> Verified that this works, and that the (filtered) diffs are clean for
> all tags. I've incorporated the script changes and pushed them out to
> the github repository I sent out earlier. I don't quite get the
> filtering expression here--can someone clarify?
>
> diff -r --exclude=CVS --exclude=.git . ../cvs-co/${tag} \
> | grep "\$Header:" -v \
> | egrep -v "^\-\-\-$" \
> | egrep -v "^[0-9]+c[0-9]+" \
> | grep -v "^diff \-r '\-\-exclude"
>
> Is this just filtering out the '$Header' in the Makefile from back in
> the day? Is there a better way to deal with this? It seems to get
> expanded to include my username and path to export.

I guess so. I just chose to ignore it, as it seemed to be autogenerated and confined to comments.

> I'm fairly certain that with a judicious application of git
> filter-branch, I can strip out all those "manufactured commits" that
> Tom mentioned. The issues Kris mentioned [1] seem to be more complex,
> but I don't quite understand what they are. Can someone point out a
> specific bit of history that's mangled in Marko's github repo? I
> *think* it might be something like the problems discussed starting
> here [2] for the server migration, but it'd be helpful to see this
> going wrong in the jdbc history.
>
> Also, from reading the mailing list archives, it looks like tag-by-tag
> verification may *not* be a good way to verify sanity of full history.
> Any suggestions as to what a more extensive automated verification
> method might entail? We'd want to do manual spot-checks as well, of
> course, but is there something more we can do for an automated check?
>
> [1]: http://archives.postgresql.org/pgsql-www/2008-12/msg00124.php
> [2]: http://archives.postgresql.org/pgsql-hackers/2010-08/msg01274.php

After reading [2] and similar documentation about cvs, I have come to a conclusion in two parts.

a) Because CVS tags single files instead of a branch, it is impossible to convert them to true git tags. I consider this bad design in CVS.

b) Because of this design flaw, I suggest we move to git ASAP.

PS: Does it really matter if the history is a bit altered by the conversion process? As long as HEAD and the tags are the same? The history could be kept in a read-only cvs-repository :-)

Regards,
Kny
> PS: Does it really matter if the history is a bit altered by the > conversion process? As long as HEAD and the tags are the same? I think we can tolerate some minor discrepancies, but we do want 1. Diff-free HEAD, tags, and tips of branches 2. A sensible-looking development history that represents more or less what happened in CVS If we only pay attention to (1), we lose too much information, since, e.g., development history can be critical in determining when a bug was introduced and what releases it affects. > The history could be kept in a read-only cvs-repository :-) Or we could keep a git mirror and you submit patches that are applied to the CVS repo, which you then pull into your repo through git-cvsimport once they're committed. There are all sorts of technical tricks we can play, but I think the goal is to minimize that, and essentially have a git repo representing the entire development history as if pgjdbc had been using git from day one. --- Maciek Sakrejda | System Architect | Truviso 1065 E. Hillsdale Blvd., Suite 215 Foster City, CA 94404 (650) 242-3500 Main www.truviso.com
On Sat, Oct 8, 2011 at 3:06 PM, Maciek Sakrejda <msakrejda@truviso.com> wrote: > Or we could keep a git mirror and you submit patches that are applied > to the CVS repo, which you then pull into your repo through > git-cvsimport once they're committed. There are all sorts of technical > tricks we can play, but I think the goal is to minimize that, and > essentially have a git repo representing the entire development > history as if pgjdbc had been using git from day one. Are we at a state where this is workable? Dave
On Wed, 2011-10-12 at 11:28 -0400, Dave Cramer wrote: > Are we at a state where this is workable ? I think that my conversion (the simple one) is good to go. It would be nice to look at the merge history in gitx or gitk and check that the tagging and branching look sensible. On the other hand, I have not looked at Marko's conversion, but I presume that it works as well (or maybe better). Regards, Kny
> Are we at a state where this is workable ? I'm happy to try to strip the dummy commits from Marko's conversion using git filter-branch, but I'm travelling right now, so that probably wouldn't happen for three weeks and change. Other than that, using his export scripts, I was able to validate that the contents of all *tags* are identical. If someone can provide other validation rules, I'm happy to work toward meeting them (although, again, 3+ weeks). If someone wants to do that cleanup independently, git filter-branch is a little intimidating, but not exactly rocket science. It's sort of like a sed expression applied to an entire branch history (or history of all branches). My plan was to look for commits with the dummy author (since I believe those are the only dummy ones) in GIT_AUTHOR_NAME, and exclude those. Yes, this is a little hand-wavy; I won't promise this will go smoothly ;) . That said, this is relatively basic and everything that I've ever had to do with git filter-branch, I was able to do by adapting the examples on the man page. --- Maciek Sakrejda | System Architect | Truviso 1065 E. Hillsdale Blvd., Suite 215 Foster City, CA 94404 (650) 242-3500 Main www.truviso.com
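The plan described above might look roughly like this. It is a sketch adapted from the skip_commit example in the git filter-branch man page, not something that has been run against Marko's repository. The function name strip_dummy_commits and the DUMMY_AUTHOR variable are invented for illustration, and the real dummy author name is not given in this thread, so it is passed in as an argument.

```shell
#!/bin/sh
# strip_dummy_commits NAME (hypothetical helper): rewrite all branches and
# tags of the current repository, dropping every commit whose author name
# is exactly NAME. Children of a dropped commit are reparented onto its
# parents.
strip_dummy_commits() {
    # exported so the commit filter, which runs in a child shell, can see it
    DUMMY_AUTHOR=$1
    export DUMMY_AUTHOR
    git filter-branch \
        --commit-filter '
            if [ "$GIT_AUTHOR_NAME" = "$DUMMY_AUTHOR" ]; then
                skip_commit "$@"
            else
                git commit-tree "$@"
            fi' \
        --tag-name-filter cat \
        -- --all
}
```

This is best tried on a throwaway clone. skip_commit removes the commit object but not its file changes, which stay in the rewritten history through the following commits. A tag that points directly at a removed commit may end up on its parent instead, so the tag-by-tag diff check would need to be repeated after the rewrite.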