A vision for Hoa Git repositories


#1

Hello fellow Hoackers!

After a long discussion with @ashgenesis, we agreed that we spend too much time on Hoa infrastructure. I can list two reasons: we rely on free services that are not always reliable, and we are a small team relative to our ambitions.

Current state

In the “New hosting strategy” thread, we discussed a new hosting strategy. What we need to host:

  • Our Git repositories,
  • The site, and
  • an ElasticSearch instance.

What we no longer need to host:

  • Mailing lists, because they have been migrated to this Discourse, and
  • Emails, which are hosted by Gandi.

What we have removed:

  • Download archives, because most people use Composer now (it is our main source distribution vector), and
  • The forum (merged into the Discourse instance).

The site and the ES instance can be hosted by CleverCloud, as they have agreed to sponsor Hoa. Thanks! What remains to be hosted is the Git repositories.

Next state, a proposal

Here is the result of the discussion we had with @ashgenesis.

Why do we need to self-host Git repositories?

  1. For security,
  2. For control,
  3. For bots,
  4. For mirrors.

The proposal: Stop self-hosting Git repositories and migrate to Github.

Security

I don’t think we are better at security than Github. Yes, we have some neat security features, but they do not really increase the trust people have in Hoa, so I don’t think they are worth the time we spend on them.

Control

We can manage all the repositories as we want: archive them, back them up, serve them with cgit, back up tarballs, etc. Again, it takes time, and Github does all of that. More recently, Github introduced new features to manage teams, and it is now equivalent to what we have with Gitolite. Since there is no longer any benefit, why bother with Gitolite, cgit & co.?

Bots

Historically, Hoa provides 4 installation flavors:

  1. Central, globally —aka system-wide— (in /usr/local/lib/Hoa),
  2. Central, locally —aka project-wide—,
  3. Each library globally,
  4. Each library locally.

That was in the pre-Composer era. As I said, Composer is now our main distribution vector. The second one is Git, for specific libraries. It’s been years since I heard someone come to me saying: “Eh, we installed Central”. It does not make sense anymore.

Bhoat is the nice bot that keeps Central and all other library repositories in sync. Thank you Bhoat. When a PR from Github is merged into git.hoa-project.net, it closes everything on Github & co. It’s nice.

However, why have Central today? We can keep the repository as an issue tracker (e.g. for RFCs, or project-wide issues), but it no longer makes sense to merge all the libraries into it. Having a Central repository no longer makes sense. In this context, if we kill Central, then Bhoat is useless. Then there is no bot. Then we can move to Github.

On Github, we can use Bors, https://bors.tech/. @ashgenesis tried to install it locally in the past, but it only works with the Github API right now. Bors is a valuable bot for us to manage PRs and increase safety.

Mirrors

Bhoat was also responsible for keeping mirrors in sync. Our mirrors are:

  • Github,
  • Gitlab,
  • Pikacode.

Gitlab has a mirror feature: it regularly pulls from a remote. This is how our Gitlab mirrors have worked recently for all repositories (@ashgenesis did this, I’ll let him confirm). Gitlab pulls from Github (mirrors pull from mirrors, huhu). Bhoat is no longer in charge of maintaining Gitlab as a mirror; it’s automatic.

Pikacode has a similar feature. @ashgenesis: Is it the case for all repositories? Also, I personally don’t mind if we drop Pikacode as a mirror platform. We added it because we had only one mirror at that time, but now there is Gitlab.

Github is the main mirror. Bhoat still needs to keep Github in-sync. But if we drop git.hoa-project.net and switch to Github, this problem is solved.

Note that Github is the source for our Packagist entries (because we think Github is more reliable than our servers, which has been true many times unfortunately).

Moving to Github

Github is a mirror for our Git repositories, but also for our configurations. Teams, SSH keys, GPG keys, permissions, organisations… I spend too much time updating both git.hoa-project.net (and its sub-infrastructure) and Github for every modification.

Yes, Github is closed-source. Yes, Gitlab provides more features, like CI, better Pages, and private repositories. But our community of 2400+ stargazers lives on Github. Hoa is on Github. I might start my next project on Gitlab, but it’s hard to move Hoa away from Github. Everything will be easier. And that’s the goal right now.

Thoughts?


#2

I confirm for Bors: once we migrate to Github, we can use it, and it will help us a lot to manage PRs and increase safety.

Regarding mirrors, all Pikacode repositories are mirrored from git.hoa-project.net, and only a few of them are mirrored on Gitlab. The migration is not finished yet, but it can be continued from Github.

I am with @Hywan regarding this vision of the repositories; it will help us a lot to focus on what is most important for the project.


#3

Hi, this is a very long topic with a lot of different subjects inside.

ElasticSearch

What is its purpose?

git

For me, whether or not we keep multiple repositories is not a real problem, as long as it’s clear where they are and on which one we accept issues, PRs, etc.

Gitlab as a mirror can be a good way to not “have all our eggs in the same basket”. It offers a lot of interesting features that don’t exist anywhere else.

Regarding leaving Github, I think it’s a very bad idea. Github is very important in terms of image, perception, etc. (also called marketing!), and for an open-source project that is very important.

Central

Central is very interesting, at least as a central point for our discussions. But as a repository, it has always perplexed me. For me, playing with submodules or subtrees can be a solution to have all the code in one repository. But except for a framework or tools (CMS, ERP, …), I don’t see the interest for us.

Installation flavor

Hoa can be installed with composer. We can always provide an installation with phar for some of our tools like the devtools with a tools like:

We could also do what jubianchi made for atoum (offline for now): have a tool that builds the PHAR, signs it, and makes it available. We can also achieve the same goal without hosting it ourselves, through Travis/Gitlab CI, using some Github/Gitlab API to store the signature and the PHAR.


#4

Actually no, it’s just about dropping Central and moving the origin remote from git.hoa-project.net to Github :-).
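For an existing local clone, moving the origin remote comes down to one `git remote set-url` call. A minimal sketch, assuming a clone that still points at git.hoa-project.net; `switch_origin` is a hypothetical helper name, and the URL in the usage comment is a placeholder to adapt, not an official value:

```shell
# Hypothetical helper: point the `origin` remote of a local clone at a new URL,
# then print the URL Git now uses so the change can be checked.
switch_origin() {
    repository=$1
    new_url=$2
    git -C "$repository" remote set-url origin "$new_url"
    git -C "$repository" remote get-url origin
}

# Usage (the URL below is an assumed placeholder):
#   switch_origin ./Some-Library https://github.com/<organisation>/<library>.git
```

Nothing else changes in the clone: branches, tags and history stay as they are, only the address used by `git fetch` and `git push` is different.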

That’s not the purpose of this discussion. We already use ElasticSearch for the search on the site. Nothing new. I was summarizing.

We already have our repositories on Gitlab, as mirrors only.

I’m not sure we want to provide more source distribution vectors. The goal of the discussion is to simplify things, not to add more. If someone asks for a PHAR, why not, but that’s not the topic of this thread :wink: .


#5

I’m 100% for all the described changes. It’s always a good idea to tackle complexity… If something is no longer relevant and requires time, we shouldn’t have a problem dropping it.

Actually, Github is the place where everything happens. Using the Github repos as the primary ones makes sense. I’m OK about Central too; having this “mono”-repo doesn’t make sense today.


#6

Github is a good choice. It makes no sense to me to not trust it.

About Hoa\Central, I’m not sure I understand why all the libraries could not be installed through Composer too. It would be easier to maintain.

Symfony uses the symfony/symfony package as a monolith, as is done today with hoa/central. But we could consider having a hoa/central package with a composer.json that describes all the dependencies.


#7

This is just not how our users install Hoa. Symfony is a framework, you might need everything if you’re lazy at some point, but it’s not the usage for Hoa.


#8

I agree 100%. So Central will serve as an issue tracker only, without any committed files, I guess?


#9

On one hand, I find the Central issue tracker interesting: there is only one place to post, find and track them all. It is easier for the hoackers to track them. On the other hand, it can be a little messy: there are so many domains in Hoa that people who follow one subject will see issues about all subjects. See the Symfony issue tracker, where there are so many issues from different parts of the project… It is quite difficult to get into it.


#10

Totally agree with the proposal. However, Central must have a custom README that explains its new goal.


#11

Exactly.

Central will not be the only place for all issues. It will continue as it is now, i.e. for issues that are “cross-libraries” (like RFC).


#12

Thank you for your replies. Everyone seems happy with this decision. Let’s move forward, it’s time to elaborate a plan.

Ensure Pikacode and Gitlab pull from Github

We must ensure that each library has an automatic mirror to Pikacode and Gitlab.
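One way to check a mirror is to compare the commit each remote advertises for a branch, using `git ls-remote`. A rough sketch under that assumption; `mirror_in_sync` is a hypothetical helper, and the URLs and branch name are whatever each library actually uses:

```shell
# Hypothetical helper: succeeds when two remotes expose the same commit
# for a given branch.
# Usage: mirror_in_sync <source-url> <mirror-url> <branch>
mirror_in_sync() {
    # `git ls-remote` prints "<hash><TAB><ref>"; keep only the hash.
    source_head=$(git ls-remote "$1" "refs/heads/$3" | cut -f1)
    mirror_head=$(git ls-remote "$2" "refs/heads/$3" | cut -f1)
    # An empty hash means the branch is missing; treat that as out of sync.
    [ -n "$source_head" ] && [ "$source_head" = "$mirror_head" ]
}

# Usage (placeholder URLs):
#   mirror_in_sync <github-url> <gitlab-url> master && echo "in sync"
```

Pull mirrors only refresh periodically, so a mismatch right after a push to Github does not necessarily mean the mirror is broken.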

Install Bors

First, we need to install Bors on all our repositories. Once it’s done, we will no longer merge by hand. That’s the ultimate goal.

Disable Bhoat

To disable Bhoat, we have to replace its features.

bhoat library-into-central, aka clear Central

We must remove all the Hoa and Hoathis directories. We must clean the Extra directory, but some parts are used on the server side: central.hoa-project.net. The /Resource/ API must be kept, but updated to remove the redirection to git.hoa-project.net.

bhoat irc, aka replace Bhoat notifications

Bhoat notifies us about Git repository activity. I think Github can provide the same level of notifications. We have to configure all repositories by hand though…

bhoat archive, aka generate archives, tarballs…

Github already provides this. Nothing to do here.

bhoat data, aka activities for all repositories

I think it is possible to replace this with a GraphQL query run by a cron job. I also don’t think it is really relevant to keep the “Activity” graph on the homepage (the only place where this data is used), https://hoa-project.net/En/. What do you think?

bhoat data “simply” runs this command on Central:

$ git --git-dir=$working/.git log --all --date-order --since=2.month --format="%ct" \
    | sort -r \
    | xargs -n1 -I_ date -d "@_" "+%Y-%m-%d" \
    | uniq -c \
    | awk '{print $1,"\t",$2}' \
    | sed -e 's/ //g' >> $file
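The counting half of that pipeline does not depend on Git, so it could be reused with timestamps coming from another source. A minimal sketch, assuming GNU `date` (for `-d @<timestamp>`); `commits_per_day` is a hypothetical name, and unlike the command above it computes days in UTC so the output does not depend on the server timezone:

```shell
# Hypothetical helper: reads Unix timestamps (one per line) on stdin and
# prints "<count><TAB><YYYY-MM-DD>" for each day, most recent day first.
# Assumes GNU date; days are computed in UTC.
commits_per_day() {
    sort -rn \
        | while read -r timestamp; do
              date -u -d "@$timestamp" '+%Y-%m-%d'
          done \
        | uniq -c \
        | awk '{ printf "%s\t%s\n", $1, $2 }'
}

# Usage with a local checkout ($working as in the command above):
#   git --git-dir="$working/.git" log --all --since=2.month --format='%ct' \
#       | commits_per_day
```

Printing the tab directly from `awk` avoids the extra `sed` step that strips spaces.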

bhoat mirror, aka sync mirrors

Mirrors will pull automatically from Github, so nothing to do here.

Hooks

This discussion must be private. Bhoat has dozens of hooks fired by external events. We can’t discuss them publicly (yet?), but we have to address all of them. @ashgenesis, @pierozi, please ping me on IRC.

Deadline

Should we set a deadline? I think it’s a good idea. Who volunteers to help?


#13

I’ll be happy to help :smile:. Setting a deadline is a nice idea. I don’t know how much time will be required to accomplish all these tasks… Maybe the end of March?

To start, I can help by checking the mirrors on Gitlab / Pikacode; maybe I’ll need some permissions… What do you think, @ashgenesis?


#14

@shulard, you should have enough rights on Pikacode for now to change the mirror URLs.


#15

Deal. Deadline is end of March. I’m adding a reminder in my calendar.


#16

I’ve migrated the Pikacode mirror URLs to use Github, and I’ll do the same for Gitlab before the end of the week.


#17

I guess almost everything is done now. We need to check one last time that everything is OK, and then start using Github as the main repository.

What about our Gitolite: do we keep it for now as a backup?


#18

Can you list what has been done exactly :-)? Thanks!


#19

Maybe I was too enthusiastic; it’s not finished yet. With @shulard, we have configured all repositories on Gitlab and Pikacode to mirror from Github. So now we can deactivate the mirroring on Bhoat.

We still have the other parts of Bhoat to handle and Bors to install.


#20

15 days before the deadline for this project :-).