Non-committers can't keep authenticated forks updated

Open

6 participants

45mg
Liliana Marie Prikler
Nicolas Graves
Ricardo Wurmus
Saturanya Rahjane de Lasca
Tomas Volf

Owner: unassigned

Submitted by: 45mg

Severity: normal

45mg wrote 4 days ago

Message-ID:878qrednyx.fsf@gmail.com

Hi Guix,

First of all, please spare me a few paragraphs to explain why I'm CC'ing
guix-devel on this bug report (I promise it's a good reason this time!).

Introduction
============

As has been mentioned many, MANY times on these lists, patch review is
erratic in Guix, and patches can be neglected for months. This can be
frustrating when you /need/ those patches on your own system (esp. when
you're patching anything under guix/, and not just adding or updating
packages). To avoid this frustration, as suggested by Felix Lechner [1],
one can fork Guix, apply patches in a separate branch, and `guix pull`
from that branch. Given that even a moderately active contributor will
probably have multiple open patches at any point in time, this needs to
work as a long-term arrangement - which means the fork needs to be kept
updated with upstream Guix.

Next, authentication. I'm sure I don't need to justify the need to
authenticate the code that our systems run on, especially since this
project has done so for years now. Especially for those of us that host
our forks remotely and pull over a network, pulling with
`--disable-authentication` becomes a security risk.

Now, finally, I can state the problem at hand - Unless you are
authorized to make authenticated commits into upstream Guix, /you cannot
keep an authenticated fork updated/. (Explanation to follow.)

This is a serious problem for anyone who's looking to become more active
in Guix; they must either give up security to use a fork, or wait months
before being able to benefit from their own work.

Explanation
===========

To the best of my knowledge, this problem was first mentioned on
Help-Guix by Tomas Volf [2], who patched `guix git authenticate` to work
around it [3]. Here's (a touched-up version of) the explanation of this
issue found in that patch's description:

--8<---------------cut here---------------start------------->8---
When authenticating merge commits, intersection of authorized keys from
all parents is used. That is fine in Guix proper, since all involved
commits are under the control of the Guix committers, however it does
not work that well for authenticating merge commits in forks.

When Guix fork is created (starting from Guix-proper commit A), new
commit (authorizing the fork creator's signing key K) is created (I).
Later, when update from Guix proper (U) is merged, new merge commit is
created (M):

     M
    / \
   I   U
    \ /
     A

The M is signed with the K. However since the K is allowed by only one
parent (I), it will not be in the set of authorized keys (intersection
of keys from I and U). So, commit M cannot be authenticated.

Thus, an authenticated fork cannot be kept updated.
--8<---------------cut here---------------end--------------->8---
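
To make the failure concrete, here is a minimal Python sketch of the check described above. It is a toy model and not Guix's actual code: the `Commit` class, the `satisfies_invariant` helper, and the key names are all made up for illustration. It only encodes the rule that a commit's signer must be authorized by every one of its parents.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Commit:
    """Toy commit: just enough state to evaluate the invariant."""
    name: str
    signer: Optional[str]          # key that signed the commit, None if unsigned
    authorized: frozenset          # keys listed in this commit's .guix-authorizations
    parents: tuple = ()


def satisfies_invariant(commit: Commit) -> bool:
    """True if the signer is authorized by *every* parent of the commit."""
    if commit.signer is None or not commit.parents:
        return False
    # Intersection of the authorized key sets of all parents.
    allowed = frozenset.intersection(*(p.authorized for p in commit.parents))
    return commit.signer in allowed


UPSTREAM_KEYS = frozenset({"committer-key"})   # hypothetical upstream committer
FORK_KEYS = UPSTREAM_KEYS | {"K"}              # the fork additionally authorizes K

a = Commit("A", "committer-key", UPSTREAM_KEYS)
i = Commit("I", "K", FORK_KEYS, (a,))          # fork introduction, trusted explicitly
u = Commit("U", "committer-key", UPSTREAM_KEYS, (a,))
m = Commit("M", "K", FORK_KEYS, (i, u))        # merge of upstream into the fork

print(satisfies_invariant(u))   # upstream commit: signer authorized by A
print(satisfies_invariant(m))   # merge: K is authorized by I but not by U
```

The introduction I is not checked here because a channel introduction is trusted by declaration rather than by its parents.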

A Prospective Solution
======================

I discovered Tomas's patch [3] more than a year later. I initially
wanted to contribute it upstream to solve this issue, but I discovered
that it leaves room for a rather serious attack [4]. So I had to rule it
out.

After some brainstorming, I thought of a solution. I'm paraphrasing the
relevant part of the mail in which I articulated it [5] below (this mail
should be a reply to that one):

--8<---------------cut here---------------start------------->8---
I think I may have an idea myself; one that seems reasonably clean,
would fix our use-case of authenticating our own personal Guix forks,
and would even allow pulling branches from other people's forks and
authenticating those.

We could allow users to specify additional channel introductions. So,
there's always one primary introduction, but there can also be one or
more additional ones.

Commits with only one parent are authenticated normally.

For commits that have multiple parents - ie. merge commits - we weaken
the authorization invariant [6] as follows:

1. If all parents have the primary introduction as their most recent
   ancestor, then the invariant holds as usual.

2. If one or more parents has the primary introduction as its most
   recent ancestor (call these the 'primary parents'), and the rest have
   any of the additional introductions, then the merge commit is
   authenticated if and only if:
   a) it's signed by a key authorized in all of the primary parents, AND
   b) the /first parent/ [^] of the merge commit is a primary parent.

3. If all parents have the same additional introduction as their most
   recent ancestor, then the invariant holds as usual.

4. If none of the parents have the primary introduction as their most
   recent ancestor, nor do they have the same additional introduction,
   then the merge commit cannot be authenticated.

[^] Quoting from the Pro Git book [7]:

> ...the first parent of a merge commit is from the branch you were on
> when you merged (frequently master), while the second parent of a
> merge commit is from the branch that was merged...

The idea is - the primary introduction is for the part of the tree under
YOUR control. When you fork Guix and create your own branch, you use the
initial commit on your branch as the primary channel introduction. You
add upstream Guix's primary channel introduction as an additional
channel introduction. If you add anyone else's fork as a remote and pull
one of their branches, you add their primary introduction as one of your
additional introductions.

Thus, any merge into one of YOUR branches (ie. any branch with the
primary introduction as the most recent ancestor) only needs to be
signed by a key that's authorized on that branch.

But you can't merge into a branch from upstream Guix or someone else's
authenticated fork (unless you're authorized to commit to those),
because the first parent of the merge commit would not be a primary
parent (see 2b) - it would be a commit on someone else's branch. And
people not authorized by you can't merge into your branch either,
because of 2a. And finally, you can't merge someone else's fork and
upstream, or anything like that. The merge commit would not be
authenticated in any of these cases.
--8<---------------cut here---------------end--------------->8---
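
The four cases above can be sketched as a small decision function. This is only a toy model of the proposal as worded in this mail, not an implementation against Guix: `Parent`, `merge_is_authentic`, the introduction labels and the key names are all hypothetical, and each parent is assumed to already carry the introduction that is its most recent ancestor.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Parent:
    """One parent of a merge commit, as seen by the proposed rules."""
    intro: str             # introduction that is this parent's most recent ancestor
    authorized: frozenset  # keys listed in this parent's .guix-authorizations


def merge_is_authentic(signer, parents, primary, additional):
    """Apply cases 1-4 to a merge commit; parents[0] is the first parent."""
    # Every parent must descend from a declared introduction.
    if any(p.intro != primary and p.intro not in additional for p in parents):
        return False

    def signed_off_by(group):
        return all(signer in p.authorized for p in group)

    # Cases 1 and 3: all parents share one introduction, usual invariant.
    if len({p.intro for p in parents}) == 1:
        return signed_off_by(parents)

    primaries = [p for p in parents if p.intro == primary]
    if not primaries:
        return False                       # case 4
    # Case 2: (b) first parent is a primary parent, (a) signer is
    # authorized in all primary parents.
    return parents[0].intro == primary and signed_off_by(primaries)


mine = Parent("fork-intro", frozenset({"K"}))
upstream = Parent("guix-intro", frozenset({"committer-key"}))
other = Parent("other-intro", frozenset({"X"}))
extra = {"guix-intro", "other-intro"}

print(merge_is_authentic("K", [mine, upstream], "fork-intro", extra))
print(merge_is_authentic("K", [upstream, mine], "fork-intro", extra))
print(merge_is_authentic("committer-key", [mine, upstream], "fork-intro", extra))
print(merge_is_authentic("X", [other, upstream], "fork-intro", extra))
```

Only the first call, merging upstream into the fork owner's own branch with the fork's key, is accepted; the other three correspond to the 2b, 2a and case 4 rejections described in the text.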

So What Do You Want From Me Anyway, 45mg?
========================================

I've tried to think of ways in which this modification to the behaviour
of `guix git authenticate` could compromise security, but so far I
haven't been able to think of any attacks it might enable.

Of course, this only means that /I/ haven't been able to think of
anything wrong. You, dear reader, have the advantage of a unique
perspective and a fresh view on this idea. So, I'm hoping that you'll
be able to sniff out any fundamental issues with the design here.

I've actually started work on a patch series to implement this, but it's
going to be pretty slow going - I've spent several hours on it so far,
and I'm maybe a fifth of the way done. (Obviously, I'm going to have to
pace myself more; so don't hold your breath.)

In the meantime, if there's a fundamental problem with the approach I've
described, I hope you will be able to find it sooner rather than later,
before I sink even more time and energy into this endeavor.

Thanks for reading this far, and here's hoping we can achieve a better
experience for budding contributors!

45mg

[1] https://lists.gnu.org/archive/html/guix-devel/2025-01/msg00072.html
[2] https://lists.gnu.org/archive/html/help-guix/2023-09/msg00078.html
[3] https://git.wolfsden.cz/guix/tree/etc/0001-git-authenticate-Trust-all-keys-from-already-authent.patch
[4] https://lists.gnu.org/archive/html/help-guix/2025-01/msg00097.html
[5] https://lists.gnu.org/archive/html/help-guix/2025-01/msg00101.html
[6] From the 'Securing Updates' Guix blog post:
    > A commit is considered authentic if and only if it is signed by
    > one of the keys listed in the .guix-authorizations file of each of
    > its parents. This is the authorization invariant.
    https://guix.gnu.org/en/blog/2020/securing-updates/
[7] https://git-scm.com/book/en/v2/Git-Tools-Revision-Selection

Liliana Marie Prikler wrote 3 days ago

Message-ID:a0bd47239c7b8443e26e75c858a896db3d00c987.camel@gmail.com

On Tuesday, 14.01.2025 at 04:21 +0000, 45mg wrote:

> --8<---------------cut here---------------start------------->8---
> When authenticating merge commits, intersection of authorized keys
> from all parents is used. That is fine in Guix proper, since all
> involved commits are under the control of the Guix committers,
> however it does not work that well for authenticating merge commits
> in forks.
> 
> When Guix fork is created (starting from Guix-proper commit A), new
> commit (authorizing the fork creator's signing key K) is created (I).
> Later, when update from Guix proper (U) is merged, new merge commit
> is created (M):
> 
>      M
>     / \
>    I   U
>     \ /
>      A
> 
> The M is signed with the K. However since the K is allowed by only
> one parent (I), it will not be in the set of authorized keys
> (intersection of keys from I and U). So, commit M cannot be
> authenticated.
> 
> Thus, an authenticated fork cannot be kept updated.
> --8<---------------cut here---------------end--------------->8---

For most use cases, this is a non-issue.  Assuming you are a single
committer to your fork, you can always rebase your changes on top of
Guix (if you're willing to bump the introductory commit) or sign the
changes to Guix with your own key (if you are willing to accept that
this changes the history).  With multiple committers, you will need to
do the latter.  Of course, you can also keep your own fork
unauthenticated, which might be preferable if you only do local work
anyway, but that's besides the issue here.

> […]
> 
> For commits that have multiple parents - ie. merge commits - we
> weaken the authorization invariant [6] as follows:
> 
> 1. If all parents have the primary introduction as their most recent
>    ancestor, then the invariant holds as usual.
> 
> 2. If one or more parents has the primary introduction as its most
>    recent ancestor (call these the 'primary parents'), and the rest
>    have any of the additional introductions, then the merge commit is
>    authenticated if and only if:
>    a) it's signed by a key authorized in all of the primary parents,
>    AND
>    b) the /first parent/ [^] of the merge commit is a primary parent.

This does not state how the additional introductions are used, if at
all.  It may mean that the additional introductions are pointless other
than for blocking case 4.

> 3. If all parents have the same additional introduction as their most
>    recent ancestor, then the invariant holds as usual.
> 
> 4. If none of the parents have the primary introduction as their most
>    recent ancestor, nor do they have the same additional
>    introduction, then the merge commit cannot be authenticated.

> The idea is - the primary introduction is for the part of the tree
> under YOUR control. When you fork Guix and create your own branch,
> you use the initial commit on your branch as the primary channel
> introduction.  You add upstream Guix's primary channel introduction
> as an additional channel introduction. If you add anyone else's fork
> as a remote and pull one of their branches, you add their primary
> introduction as one of your additional introductions.
> 
> Thus, any merge into one of YOUR branches (ie. any branch with the
> primary introduction as the most recent ancestor) only needs to be
> signed by a key that's authorized on that branch.
> 
> But you can't merge into a branch from upstream Guix or someone
> else's authenticated fork (unless you're authorized to commit to
> those), because the first parent of the merge commit would not be a
> primary parent (see 2b) - it would be a commit on someone else's
> branch. And people not authorized by you can't merge into your branch
> either, because of 2a. And finally, you can't merge someone else's
> fork and upstream, or anything like that. The merge commit would not
> be authenticated in any of these cases.

> So What Do You Want From Me Anyway, 45mg?
> ========================================
> 
> I've tried to think of ways in which this modification to the
> behaviour of `guix git authenticate` could compromise security, but
> so far I haven't been able to think of any attacks it might enable.
> 
> Of course, this only means that /I/ haven't been able to think of
> anything wrong. You, dear reader, have the advantage of a unique
> perspective and a fresh view on this idea. So, I'm hoping that you'll
> be able to sniff out any fundamental issues with the design here.

I think this might still hide a serious flaw.  With the way *upstream*
authentication works.  Let's flip the example in [6] around a little
bit and construct the following:

-A---B---C---D
  \       \
   \       \-E---F---?
    \               /
     \----G--H--I*-/
  
Both A and I* are introductory commits on their various branches.  In
?, any committer who has valid keys in both F and I* can merge a
branch with unsigned commits, effectively voiding the invariant of
BCEF, e.g. by undoing any changes that happened there.  Of course, they
can do so with signed commits as well, given that they have commit
access to the main repository, but the point still holds that they may
introduce unsigned commits to any fork where their key is valid in.

Cheers

45mg wrote 3 days ago

Message-ID:87h660xeld.fsf@gmail.com

Hi Liliana!

Liliana Marie Prikler <liliana.prikler@gmail.com> writes:

> For most use cases, this is a non-issue. Assuming you are a single
> committer to your fork, you can always rebase your changes on top of
> Guix (if you're willing to bump the introductory commit)

The idea of authentication is that once you trust the channel
introduction, you can be sure that everything you pull after that is
authentic. The introduction only needs to be trusted once. If you're
bumping the introduction every time, then you need to obtain and verify
the introduction every time. You're going from 'Trust On First Use' to
'Trust On Every Use'. Not ideal IMO.

> or sign the changes to Guix with your own key (if you are willing to
> accept that this changes the history). With multiple committers, you
> will need to do the latter.

While this could actually work, without changing the history, the
problem is that there is no easy way to authenticate upstream commits.

You could do it like this:
0) Before creating your fork, authenticate every commit in the Guix
   checkout (as described in the manual).
1) Switch to your branch that tracks upstream.
2) Pull from upstream.
3) Run `guix git authenticate`, supplying Guix's channel introduction as
   arguments.
4) After this succeeds, create and switch to a branch from the current
   tip of your upstream-tracking branch. Edit .guix-authorizations to
   add your key, and create a signed commit.
5) Merge this branch into your fork branch.
6) Switch back to your fork branch.
7) Delete the [guix "authentication"] section from .git/config.
8) Run `guix git authenticate` with the introduction of your fork
   branch, to authenticate the merge commit.
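
For illustration, steps 1 to 8 could be laid out as a dry-run plan like the one below. This is a hedged sketch, not a tested workflow: the function name `update_plan`, the branch names, and the introduction values are placeholders; signing the merge with `-S` is an assumption (the list above does not say how the merge commit gets signed); the checkout is placed before the merge so that the merge lands on the fork branch; and it has not been verified here that the final authentication succeeds. Step 0 is a one-time action and is left out.

```python
from __future__ import annotations


def update_plan(upstream_branch, fork_branch, tmp_branch,
                upstream_intro, fork_intro):
    """Return (note, argv) pairs; argv is None for a step done by hand.

    upstream_intro and fork_intro are (commit, signer-fingerprint) pairs,
    the two positional arguments `guix git authenticate` takes.
    """
    up_commit, up_signer = upstream_intro
    fork_commit, fork_signer = fork_intro
    return [
        ("1. switch to the upstream-tracking branch",
         ["git", "checkout", upstream_branch]),
        ("2. pull from upstream",
         ["git", "pull"]),
        ("3. authenticate against Guix's introduction",
         ["guix", "git", "authenticate", up_commit, up_signer]),
        ("4. branch off the authenticated tip",
         ["git", "checkout", "-b", tmp_branch]),
        ("4. add your key to .guix-authorizations by hand",
         None),
        ("4. record the edit in a signed commit",
         ["git", "commit", "-S", "-m", "Authorize fork key.",
          ".guix-authorizations"]),
        ("6. switch to the fork branch",
         ["git", "checkout", fork_branch]),
        ("5. merge the temporary branch (signed merge is an assumption)",
         ["git", "merge", "-S", tmp_branch]),
        ("7. drop the [guix \"authentication\"] section from .git/config",
         ["git", "config", "--remove-section", "guix.authentication"]),
        ("8. authenticate against the fork's introduction",
         ["guix", "git", "authenticate", fork_commit, fork_signer]),
    ]


plan = update_plan("upstream-master", "my-fork", "authorize-me",
                   ("UPSTREAM-INTRO-COMMIT", "UPSTREAM-SIGNER-FINGERPRINT"),
                   ("FORK-INTRO-COMMIT", "FORK-SIGNER-FINGERPRINT"))
for note, argv in plan:
    print(note, "->", " ".join(argv) if argv else "(manual)")
```

Nothing is executed; the plan only prints what would be run, which keeps the manual edit of .guix-authorizations visible as a separate step.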

That's a lot of manual steps for every pull from upstream! While I do
have to give you credit for this idea - at least we now have a
workaround for people who are determined enough - I'm guessing a lot of
people will probably just skip authentication if it's going to be this
annoying. Authenticating a fresh clone from scratch will be even more
annoying, especially if you have multiple fork branches (eg. you're
tracking someone else's fork).

We could create a script to do all the steps for us, but if and when it
fails on whatever insane edge cases people are able to come up with,
they're going to need to understand all the steps involved anyway.
Abstraction is not a substitute for a clean underlying design.

Also I just want to point out that rebasing /will/ change the history.
The `guix pull` after every time you update your fork will need to be a
force-pull (--allow-downgrades [1]).

> Of course, you can also keep your own fork unauthenticated, which
> might be preferable if you only do local work anyway, but that's
> besides the issue here.

Yes, to be clear, I'm talking about the use-case where your fork is
hosted remotely, and you or someone else needs to pull changes from it.
For example, my prospective use case would be quickly bootstrapping Guix
on a new machine - I build my own installation image, and I'd want it to
pull from my fork. I can include my introduction into my installer, just
like the official one. But if the introduction changes before I use my
installer, then the first pull can't be authenticated.

> This does not state how the additional introductions are used, if at
> all. It may mean that the additional introductions are pointless other
> than for blocking case 4.

My bad, I guess I forgot to explain that.

The purpose of the additional introductions is to make it so that signed
commits from upstream Guix, or commits from other people's forks, can
still be authenticated. As I mentioned above, the current design is not
suited to this.

To go a bit more into detail - we will accomplish authentication by
doing a postorder traversal of the commit tree, considering the latest
commit as the root node. We traverse its parents recursively until we
reach a commit whose parent is one of the channel introductions (primary
or additional). Then that commit and all its children are authenticated
from the introduction that we encountered. In this way, every commit is
authenticated from the introduction that is its most recent ancestor.
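
A rough Python sketch of that traversal follows. It is one possible reading of the paragraph above, not the planned patch: the function name `nearest_introductions`, the dictionary-based commit graph, and the commit labels are invented for the example. Each commit is mapped to the set of introductions reached first when walking back through its parents; a merge whose parents descend from different introductions ends up with more than one.

```python
from __future__ import annotations


def nearest_introductions(tip, parents, intros):
    """Map every commit reachable from `tip` to its nearest introductions.

    `parents` maps a commit to the list of its parents; `intros` is the
    set of declared introductions (primary and additional).  The walk is
    a postorder traversal with an explicit stack and stops descending at
    an introduction.
    """
    result = {}
    stack = [(tip, False)]
    while stack:
        commit, expanded = stack.pop()
        if commit in result:
            continue
        if commit in intros:
            result[commit] = frozenset({commit})
        elif expanded:
            # All parents were resolved before this entry was popped again.
            found = frozenset()
            for parent in parents.get(commit, ()):
                found |= result[parent]
            result[commit] = found
        else:
            stack.append((commit, True))
            stack.extend((p, False) for p in parents.get(commit, ()))
    return result


# G0 is upstream's introduction, I the fork's; M merges upstream's U into
# the fork branch, with the fork commit F1 as first parent.
history = {
    "M": ["F1", "U"],
    "F1": ["I"],
    "I": ["A"],
    "U": ["A"],
    "A": ["G0"],
    "G0": [],
}
found = nearest_introductions("M", history, {"G0", "I"})
for commit in ("U", "F1", "M"):
    print(commit, sorted(found[commit]))
```

In this example U resolves to G0 only, F1 to I only, and the merge M to both, which is the situation the merge rules of the proposal are meant to decide.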

> I think this might still hide a serious flaw.  With the way *upstream*
> authentication works.  Let's flip the example in [6] around a little
> bit and construct the following:
>
> -A---B---C---D
>   \       \
>    \       \-E---F---?
>     \               /
>      \----G--H--I*-/
>   
> Both A and I* are introductory commits on their various branches.  In
> ?, any committer who has valid keys in both F and I* can merge a
> branch with unsigned commits, effectively voiding the invariant of
> BCEF, e.g. by undoing any changes that happened there.  Of course, they
> can do so with signed commits as well, given that they have commit
> access to the main repository, but the point still holds that they may
> introduce unsigned commits to any fork where their key is valid in.

So, my design enables an attacker who can make authorized signed commits
to also introduce changes made in unsigned commits. Hmm.

I don't think this compromises our current security guarantees, though?
If the attacker can already make trusted commits, then any attack they
can perform in the way you described can also be done directly with
signed commits onto F, as you pointed out. And the latter way would be
far simpler for them.

Also, the branch they merged into would not contain any unsigned
commits; the commit '?' is still signed with a key authorized for F's
branch. So at most, we can say that the attacker can introduce /changes
made in/ unsigned commits, not 'introduce unsigned commits'.

Once you manage to revoke their commit access, you'd just revert the
'?' commit and delete the GHI branch (which is the one that contains
unsigned commits). The same way you'd recover from them directly making
malicious changes on master.

[1] https://issues.guix.gnu.org/41604#16-lineno14

Liliana Marie Prikler wrote 2 days ago

Message-ID:cdc27d94591d7865b034b5057ce7ad25b930921c.camel@gmail.com

Hi,

On Wednesday, 15.01.2025 at 15:48 +0000, 45mg wrote:

> The idea of authentication is that once you trust the channel
> introduction, you can be sure that everything you pull after that is
> authentic. The introduction only needs to be trusted once. If you're
> bumping the introduction every time, then you need to obtain and
> verify the introduction every time. You're going from 'Trust On First
> Use' to 'Trust On Every Use'. Not ideal IMO.

Let's recall that the entity you need to trust is still yourself in
most of those cases.

> You could do it like this:
> 0) Before creating your fork, authenticate every commit in the Guix
>    checkout (as described in the manual).
> 1) Switch to your branch that tracks upstream.
> 2) Pull from upstream.
> 3) Run `guix git authenticate`, supplying Guix's channel introduction
>    as arguments.
> 4) After this succeeds, create and switch to a branch from the
>    current tip of your upstream-tracking branch. Edit
>    .guix-authorizations to add your key, and create a signed commit.
> 5) Merge this branch into your fork branch.
> 6) Switch back to your fork branch.
> 7) Delete the [guix "authentication"] section from .git/config.
> 8) Run `guix git authenticate` with the introduction of your fork
>    branch, to authenticate the merge commit.
> 
> That's a lot of manual steps for every pull from upstream! While I do
> have to give you credit for this idea - at least we now have a
> workaround for people who are determined enough - I'm guessing a lot
> of people will probably just skip authentication if it's going to be
> this annoying. Authenticating a fresh clone from scratch will be even
> more annoying, especially if you have multiple fork branches (eg.
> you're tracking someone else's fork).

I think you're making this more complicated than it needs to be.
checkout, authenticate, rebase*, merge* ought to have you covered.

* you can authenticate after these if you're paranoid

> We could create a script to do all the steps for us, but if and when
> it fails on whatever insane edge cases people are able to come up
> with, they're going to need to understand all the steps involved
> anyway. Abstraction is not a substitute for a clean underlying
> design.
> 
> Also I just want to point out that rebasing /will/ change the
> history.
> The `guix pull` after every time you update your fork will need to be
> a force-pull (--allow-downgrades [1]).

No, it wouldn't. You would rebase those changes on top of what you
already have on those respective branches.

> > Of course, you can also keep your own fork unauthenticated, which
> > might be preferable if you only do local work anyway, but that's
> > besides the issue here.
> 
> Yes, to be clear, I'm talking about the use-case where your fork is
> hosted remotely, and you or someone else needs to pull changes from
> it.  For example, my prospective use case would be quickly
> bootstrapping Guix on a new machine - I build my own installation
> image, and I'd want it to pull from my fork. I can include my
> introduction into my installer, just like the official one. But if
> the introduction changes before I use my installer, then the first
> pull can't be authenticated.

I don't see why in your particular use case you can not use a channel
on top of Guix rather than replicating Guix itself.  Now there might be
some weird edge case I'm overlooking where you cut deep into the
dependency graph and that makes sense, but I sure hope that's a rare
edge case in and of itself.

> > This does not state how the additional introductions are used, if
> > at all.  It may mean that the additional introductions are
> > pointless other than for blocking case 4.
> 
> My bad, I guess I forgot to explain that.
> 
> The purpose of the additional introductions is to make it so that
> signed commits from upstream Guix, or commits from other people's
> forks, can still be authenticated. As I mentioned above, the current
> design is not suited to this.
> 
> To go a bit more into detail - we will accomplish authentication by
> doing a postorder traversal of the commit tree, considering the
> latest commit as the root node. We traverse its parents recursively
> until we reach a commit whose parent is one of the channel
> introductions (primary or additional). Then that commit and all its
> children are authenticated from the introduction that we encountered.
> In this way, every commit is authenticated from the introduction that
> is its most recent ancestor.

Yeah, I think this scheme will still end up in [4].  As pointed out in
[8], "primary" is just a convention that we can't rely on.  So let's
just talk about the idea of widening one channel introduction to any
number of channel introductions – we can always store a mapping of HEAD
→ first authenticated commit and then assert that this set is a subset
of what we declare as introductions.  (This mapping will also make
authentication as efficient as it currently is, since we don't need to
reauthenticate everything all the time.)

Is this good enough?  No: an attacker could easily add their own
introduction and call it a day.  In fact, this scheme is even worse
than what was exploited in [4], because they never need commit access
to the Guix repo to do so.  Ahh, but wait!  `guix pull` on the user's
side uses their clean set of channels for authentication.  Those only
have upstream Guix… unless you actually pull your own fork or manage an
attack as outlined below (in which case you do need commit access for
some amount of time).

> > I think this might still hide a serious flaw.  With the way
> > *upstream* authentication works.  Let's flip the example in [6]
> > around a little bit and construct the following:
> > 
> > -A---B---C---D
> >   \       \
> >    \       \-E---F---?
> >     \               /
> >      \----G--H--I*-/
> > 
> > Both A and I* are introductory commits on their various branches.
> > In ?, any committer who has valid keys in both F and I* can merge
> > a branch with unsigned commits, effectively voiding the invariant
> > of BCEF, e.g. by undoing any changes that happened there.  Of
> > course, they can do so with signed commits as well, given that they
> > have commit access to the main repository, but the point still
> > holds that they may introduce unsigned commits to any fork where
> > their key is valid in.
> 
> So, my design enables an attacker who can make authorized signed
> commits to also introduce changes made in unsigned commits. Hmm.
> 
> I don't think this compromises our current security guarantees,
> though?

I mean, the promise we do make is that all commits starting from a
certain commit are signed. So IMHO, this effectively breaks that :)

> If the attacker can already make trusted commits, then any attack
> they can perform in the way you described can also be done directly
> with signed commits onto F, as you pointed out. And the latter way
> would be far simpler for them.

Simpler, yes, but less stealthy. Most contributors don't concern
themselves with the specifics of any particular branch, and you may
even be able to dress up your evil branch as a good branch until the
point where you finally merge it.

> Also, the branch they merged into would not contain any unsigned
> commits; the commit '?' is still signed with a key authorized for
> F's branch. So at most, we can say that the attacker can introduce
> /changes made in/ unsigned commits, not 'introduce unsigned commits'.

They can make an arbitrary number of unsigned commits before needing to
sign off one commit that will be merged.  If they follow the style of
merging master into their branch and then their branch into master,
said commit can even be empty, though that would no longer be stealthy.
Now if they were to I don't know, bump 9000 Rust packages or something
like that, they have a lot of space to exploit the as-of-yet in this
manner unexploited, but still weak SHA-1 hashes Git uses.

> Once you manage to revoke their commit access, you'd just revert the
> '?' commit and delete the GHI branch (which is the one that contains
> unsigned commits). The same way you'd recover from them directly
> making malicious changes on master.

Reverting this change could land you in early 2025. And worse, your
attacker could lure you onto their branch if you happen to land on any
bad commit in the meantime.

Cheers

[8] https://lists.gnu.org/archive/html/help-guix/2025-01/msg00116.html

Tomas Volf wrote 2 days ago

Recipients:(name . Liliana Marie Prikler)(address . liliana.prikler@gmail.com)

Message-ID:87a5brsmve.fsf@wolfsden.cz

Liliana Marie Prikler <liliana.prikler@gmail.com> writes:

> I think you're making this more complicated than it needs to be.
> checkout, authenticate, rebase*, merge* ought to have you covered.
>
> * you can authenticate after these if you're paranoid 
>
>> We could create a script to do all the steps for us, but if and when
>> it fails on whatever insane edge cases people are able to come up
>> with, they're going to need to understand all the steps involved
>> anyway. Abstraction is not a substitute for a clean underlying
>> design.
>> 
>> Also I just want to point out that rebasing /will/ change the
>> history.
>> The `guix pull` after every time you update your fork will need to be
>> a force-pull (--allow-downgrades [1]).
> No, it wouldn't.  You would rebase those changes on top of what you
> already have on those respective branches.

This has the slight issue that I can no longer easily answer the
question "is this commit in my fork", since I cannot search by the
commit hash. I admit it is not a question I need to answer often (last
time was on 21st of October, CVE-2024-52867).

And merging also (and this is a more interesting property) ensures that
*all* official commits are always present in my repository on the master
branch. So I can just use guix time-machine --commit without always
forgetting the `-q' argument and having to do it a second time.

I feel like merging is a superior workflow for a long-lived soft-fork,
except for the (here debated) issue with authentication.

>> Yes, to be clear, I'm talking about the use-case where your fork is
>> hosted remotely, and you or someone else needs to pull changes from
>> it.  For example, my prospective use case would be quickly
>> bootstrapping Guix on a new machine - I build my own installation
>> image, and I'd want it to pull from my fork. I can include my
>> introduction into my installer, just like the official one. But if
>> the introduction changes before I use my installer, then the first
>> pull can't be authenticated.
> I don't see why in your particular use case you can not use a channel
> on top of Guix rather than replicating Guix itself.  Now there might be
> some weird edge case I'm overlooking where you cut deep into the
> dependency graph and that makes sense, but I sure hope that's a rare
> edge case in and of itself.

As long as the changes are limited to packages, it is (mostly) fine, you
can get very far with (inherit) and various transformations.

However changes outside of that are not that rare. A few examples follow.

Anything modifying services is a problem. As far as I know, there is no
way to modify a service the way you can do with a package.

I carry a modification to nftables-configuration which adds (tables)
field so that I can do:

(service
 nftables-service-type
 (nftables-configuration
  (tables
   (modify-nftables-tables %default-nftables-tables
     (mod 'inet
          (mod 'input
               (rep 'allow-ssh
                    (if (and sshd-port open-sshd)
                        (allow-dport-snippet "tcp" sshd-port)
                        #f))))))))

It allows me to construct the firewall gradually; however, I have not yet
decided whether I like the API or not (leaning towards "no"), so I did
not send it upstream.

#71981 has been open since July 7, and I am not aware of a way to work
around the package->symbol deficiencies from a channel.

Then there is anything modifying any of the guix commands. #74832 is
over a month old, and as far as I know, I am not able to fix guix-copy
from a channel. #72928 took over a month to merge, and again, I am not
sure how to patch guix-describe from a channel.

So, while I fully agree that package modifications *are* possible with
a channel and are the more common type of patch to carry, things that are
*not* possible with a channel are (at least in my tree), while not as
frequent, definitely not a "rare edge case".

(Yes, I am aware I can just copy&paste the service code into my channel.
But at that point I am again just "replicating Guix", just by more
manual and error-prone means. And even for packages, adjusting the system
configuration to use a package from my channel, getting it merged and then
adjusting back to upstream is an annoying chore.)

Tomas

There are only two hard things in Computer Science:

cache invalidation, naming things and off-by-one errors.

Liliana Marie Prikler wrote 46 hours ago

Recipients:(name . Tomas Volf)(address . ~@wolfsden.cz)

Message-ID:c3b1445d236ed95f768c218503089926788256a4.camel@gmail.com

Am Donnerstag, dem 16.01.2025 um 00:01 +0100 schrieb Tomas Volf:

> Liliana Marie Prikler <liliana.prikler@gmail.com> writes:
> 
> > I think you're making this more complicated than it needs to be. 
> > checkout, authenticate, rebase*, merge* ought to have you covered.
> > 
> > * you can authenticate after these if you're paranoid 
> > 
> > > We could create a script to do all the steps for us, but if and
> > > when it fails on whatever insane edge cases people are able to
> > > come up with, they're going to need to understand all the steps
> > > involved anyway. Abstraction is not a substitute for a clean
> > > underlying design.
> > > 
> > > Also I just want to point out that rebasing /will/ change the
> > > history.  The `guix pull` after every time you update your fork
> > > will need to be a force-pull (--allow-downgrades [1]).
> > No, it wouldn't.  You would rebase those changes on top of what you
> > already have on those respective branches.
> 
> This has the slight issue that I can no longer easily answer a
> question "is this commit in my fork", since I cannot search by the
> commit hash. I admit it is not a question I need to answer often
> (last time was on 21st of October, CVE-2024-52867).

You could solve this by embedding an "upstream-commit:" trailer, but
that is an admittedly cursed transformation that no longer maps to a
single rebase.

> And merging also (and this is more interesting property) ensures that
> *all* official commits are always present in my repository on the
> master branch.  So I can just use guix time-machine --commit without
> always forgetting `-q' argument and having to do it second time.

In my personal experience, time-machine breaks with third-party

channels all the time, so `-q` is probably good advice anyway. But

yeah, that's a valid concern.

> I feel like the merging is a superior workflow for long-lived soft-
> fork, expect the (here debated) issue with authentication.
>
> > > Yes, to be clear, I'm talking about the use-case where your fork
> > > is hosted remotely, and you or someone else needs to pull changes
> > > from it.  For example, my prospective use case would be quickly
> > > bootstrapping Guix on a new machine - I build my own installation
> > > image, and I'd want it to pull from my fork. I can include my
> > > introduction into my installer, just like the official one. But
> > > if the introduction changes before I use my installer, then the
> > > first pull can't be authenticated.
> > I don't see why in your particular use case you can not use a
> > channel on top of Guix rather than replicating Guix itself.  Now
> > there might be some weird edge case I'm overlooking where you cut
> > deep into the dependency graph and that makes sense, but I sure
> > hope that's a rare edge case in and of itself.
>
> As long as the changes are limited to packages, it is (mostly) fine,
> you can get very far with (inherit) and various transformations.
>
> However changes outside of that are not that rare.  Few examples
> follow.
>
> Anything modifying services is a problem.  As far as I know, there is
> no way to modify a service the way you can do with a package.
>
> I carry a modification to nftables-configuration which adds (tables)
> field so that I can do:
>
> --8<---------------cut here---------------start------------->8---
> (service
>  nftables-service-type
>  (nftables-configuration
>   (tables
>    (modify-nftables-tables %default-nftables-tables
>      (mod 'inet
>           (mod 'input
>                (rep 'allow-ssh
>                     (if (and sshd-port open-sshd)
>                         (allow-dport-snippet "tcp" sshd-port)
>                         #f))))))))
> --8<---------------cut here---------------end--------------->8---
> 
> It allows me to construct the firewall gradually, however I have not
> yet decided whether I like the API or not (leaning towards "no"), so
> I did not sent it upstream.

You can roll your own service definitions, but it does become harder

when you want to keep all changes to that service from master as well.

But `(use-modules (my-channel services nftables))` should pull that

nftables code :)

> #71981 is open since July 7 and I am not aware of a way to work
> around package->symbol deficiencies from a channel.

I mean, the right thing would be to address #71979, but I don't see

that being done either. Ludo's fix is merely lexicographic and thus

breaks with directories that aren't simply ".".

> Then there is anything modifying any of the guix commands.  #74832 is
> over month old, and as far as I know, I am not able to fix guix-copy
> from a channel.  #72928 took over a month to merge, and again, not
> sure how to patch guix-describe from a channel.

Have you considered extensions?

> (Yes, I am aware I can just copy&paste the service code into my
> channel.  But at that point I am again just "replicating Guix", just
> by more manual and error-prone means.  And even for packages,
> adjusting system configuration to use package from my channel,
> getting it merged and then adjusting back to upstream is annoying
> chore.)

You could code your channel in a way that it serves upstream stuff

either silently or with a deprecation warning if a particular package

is requested. Not a channel, but [1] illustrates my point.

Cheers

[1] https://git.ist.tugraz.at/clingabomino/clingabomino/-/blob/0.2.0/pkg/guix.scm?ref_type=tags#L30

Ricardo Wurmus wrote 45 hours ago

Recipients:(name . Liliana Marie Prikler)(address . liliana.prikler@gmail.com)

Message-ID:87frlji1j4.fsf@elephly.net

Liliana Marie Prikler <liliana.prikler@gmail.com> writes:

>> This has the slight issue that I can no longer easily answer a
>> question "is this commit in my fork", since I cannot search by the
>> commit hash. I admit it is not a question I need to answer often
>> (last time was on 21st of October, CVE-2024-52867).
> You could solve this by embedding an "upstream-commit:" trailer, but
> that is an admittedly cursed transformation that no longer maps to a
> single rebase, I admit.

You can use the Change-Id tag for this. Our local hooks create them and

they stay intact after rebase.

Ricardo

45mg wrote 40 hours ago

Recipients:

Message-ID:87v7uerjk9.fsf@gmail.com

Hi Liliana,

Liliana Marie Prikler <liliana.prikler@gmail.com> writes:

> Hi,
>
> Am Mittwoch, dem 15.01.2025 um 15:48 +0000 schrieb 45mg:
>> The idea of authentication is that once you trust the channel
>> introduction, you can be sure that everything you pull after that is
>> authentic. The introduction only needs to be trusted once. If you're
>> bumping the introduction every time, then you need to obtain and
>> verify the introduction every time. You're going from 'Trust On First
>> Use' to 'Trust On Every Use'. Not ideal IMO.
> Let's recall that the entity you need to trust is still yourself in
> most of those cases.  

If you host your repo unauthenticated on a server, you need to fully
trust the server, as well as the connection between you and the server.
Regarding the former, none of the most popular ways to host a git repo
(eg. GitHub, Codeberg, your own forge instance on a VPS) allow you to
know much about the underlying server, so you can't really assume it to
be secure. The latter is a ridiculously complicated topic that I'm not
qualified to go into. To avoid trusting all these intermediaries more
than once if at all, we have authentication.

I realise it may seem silly to worry about your own little fork being
directly targeted in ways like this, but the main reason I chose Guix in
the first place is the focus on getting the fundamentals right -
reproducibility, bootstrappability, free software, etc. - even though
most projects don't put in as much effort towards them, and even though
a lot of users may not be directly affected by these things. I think
security is one such thing. As the 'Authenticate Your Git Checkouts'
blog post [9] pointed out, we wouldn't need `guix git authenticate` if
we were willing to delegate our security to a trusted third party, like
all the open source projects that sport those "fancy “✓ verified”
badges as found on GitLab and on GitHub" do. We shouldn't force anyone
hosting a fork to do so as well.

>> You could do it like this:
>> 0) Before creating your fork, authenticate every commit in the Guix
>>    checkout (as described in the manual).
>> 1) Switch to your branch that tracks upstream.
>> 2) Pull from upstream.
>> 3) Run `guix git authenticate`, supplying Guix's channel introduction
>>    as arguments.
>> 4) After this succeeds, create and switch to a branch from the
>>    current tip of your upstream-tracking branch. Edit
>>    .guix_authorizations to add your key, and create a signed commit.
>> 5) Merge this branch into your fork branch.
>> 6) Switch back to your fork branch.
>> 7) Delete the [guix "authentication"] section from .git/config.
>> 8) Run `guix git authenticate` with the introduction of your fork
>>    branch, to authenticate the merge commit.
>> 
>> That's a lot of manual steps for every pull from upstream! While I do
>> have to give you credit for this idea - at least we now have a
>> workaround for people who are determined enough - I'm guessing a lot
>> of people will probably just skip authentication if it's going to be
>> this annoying. Authenticating a fresh clone from scratch will be even
>> more annoying, especially if you have multiple fork branches (eg.
>> you're tracking someone else's fork).
> I think you're making this more complicated than it needs to be.
> checkout, authenticate, rebase*, merge* ought to have you covered.
>
> * you can authenticate after these if you're paranoid 

The complexity is due to the requirements of not bumping the channel
introduction (to avoid the increased attack surface from having to keep
obtaining the updated one, as I discussed earlier), keeping fork history
intact (to avoid force pulls), keeping upstream history intact, and
being able to authenticate both upstream and fork commits. If you can
think of a simpler method that meets these requirements, I'd love to
hear it.

Also, I just realised that this one won't even work. The commit created
in step 4 cannot be authenticated, as it's signed with your key, which
is not in its parent's .guix-authorizations.

>> We could create a script to do all the steps for us, but if and when
>> it fails on whatever insane edge cases people are able to come up
>> with, they're going to need to understand all the steps involved
>> anyway. Abstraction is not a substitute for a clean underlying
>> design.
>>
>> Also I just want to point out that rebasing /will/ change the
>> history.  The `guix pull` after every time you update your fork will
>> need to be a force-pull (--allow-downgrades [1]).
> No, it wouldn't.  You would rebase those changes on top of what you
> already have on those respective branches.

It looks like at least one of us is misunderstanding rebasing. Could be
me, so I'm consulting the relevant chapter from the Pro Git book [11]
for a refresher.

Let's imagine that the first example given there represents our fork of
Guix, where the 'experiment' branch marks the beginning of our fork (and
its channel introduction) and the 'master' branch tracks upstream Guix.
After `git rebase master`, the commit that used to be C4 is gone, and
now C4' takes its place. It may contain the same changes, but it's a
different commit - so it (and any commits that it's the parent of) has a
different hash. So the channel introduction has changed, and so has the
entire history of the 'experiment' branch. So we need to force-pull.
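
To make that concrete, here is a minimal Python sketch of how Git derives
a commit ID, assuming the default SHA-1 object format. The helper
`commit_id`, the author line and the commit messages are made up for
illustration; only the empty-tree ID is a real Git constant. Because the
parent ID is part of the hashed content, replaying the same change onto a
different parent necessarily yields a different ID.

```python
import hashlib

def commit_id(tree, parents, message,
              who="A U Thor <author@example.org> 0 +0000"):
    """Return the SHA-1 object ID Git would assign to a commit with this content."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]      # parent IDs are hashed too
    lines += [f"author {who}", f"committer {who}", "", message]
    body = ("\n".join(lines) + "\n").encode()
    header = f"commit {len(body)}\0".encode()      # Git's object header
    return hashlib.sha1(header + body).hexdigest()

TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # Git's well-known empty tree

C2 = commit_id(TREE, [], "C2")
C3 = commit_id(TREE, [C2], "C3 (upstream)")
C4 = commit_id(TREE, [C2], "C4 (fork)")            # fork commit, based on C2
C4_rebased = commit_id(TREE, [C3], "C4 (fork)")    # same change, replayed onto C3

print(C4 != C4_rebased)  # True: same content, different parent, different ID
```

If the fork's introductory commit is among the rebased commits, its ID
changes in exactly the same way, which is why the introduction would have
to be bumped.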

>> > Of course, you can also keep your own fork unauthenticated, which
>> > might be preferable if you only do local work anyway, but that's
>> > besides the issue here.
>> 
>> Yes, to be clear, I'm talking about the use-case where your fork is
>> hosted remotely, and you or someone else needs to pull changes from
>> it.  For example, my prospective use case would be quickly
>> bootstrapping Guix on a new machine - I build my own installation
>> image, and I'd want it to pull from my fork. I can include my
>> introduction into my installer, just like the official one. But if
>> the introduction changes before I use my installer, then the first
>> pull can't be authenticated.
> I don't see why in your particular use case you can not use a channel
> on top of Guix rather than replicating Guix itself.  Now there might be
> some weird edge case I'm overlooking where you cut deep into the
> dependency graph and that makes sense, but I sure hope that's a rare
> edge case in and of itself.

See Tomas's reply [10]. I'll continue this particular tangent in that

sub-thread.

>> The purpose of the additional introductions is to make it so that
>> signed commits from upstream Guix, or commits from other people's
>> forks, can still be authenticated. As I mentioned above, the current
>> design is not suited to this.
>> 
>> To go a bit more into detail - we will accomplish authentication by
>> doing a postorder traversal of the commit tree, considering the
>> latest commit as the root node. We traverse its parents recursively
>> until we reach a commit whose parent is one of the channel
>> introductions (primary or additional). Then that commit and all its
>> children are authenticated from the introduction that we encountered.
>> In this way, every commit is authenticated from the introduction that
>> is its most recent ancestor.
> Yeah, I think this scheme will still end up in [4].  As pointed out in
> [8], "primary" is just a convention that we can't rely on.  So let's
> just talk about the idea of widening one channel introduction to any
> number of channel introductions – we can always store a mapping of HEAD
> → first authenticated commit and then assert that this set is a subset
> of what we declare as introductions.  (This mapping will also make
> authentication as efficient as it currently is, since we don't need to
> reauthenticate everything all the time.)
>
> Is this good enough?  No: an attacker could easily add their own
> introduction and call it a day.  In fact, this scheme is even worse
> than what was exploited in [4], because they never need commit access
> to the Guix repo to do so.  Ahh, but wait!  `guix pull` on the user's
> side uses their clean set of channels for authentication.  Those only
> have upstream Guix… unless you actually pull your own fork or manage an
> attack as outlined below (in which case you do need commit access for
> some amount of time).

Whew. Ok, before I can reply directly to this, I need to discuss a few

related things.

First of all, let's talk about [8]. It isn't part of this thread so I'll

quote the relevant part here:

> Problem here is that this (which parent is first) is just a convention
> that the attacker does not have to follow.  Example:
> 
> --8<---------------cut here---------------start------------->8---
> /tmp/xx $ git commit-tree -p HEAD -p HEAD~1 -m M HEAD^{tree}
> c040e61bc184b5971f68c4b794c3158350b5d5e9
> /tmp/xx $ g show c040e61bc184b5971f68c4b794c3158350b5d5e9
> commit c040e61bc184b5971f68c4b794c3158350b5d5e9
> Merge: 40ef875 17451b8
> Author: Tomas Volf <~@wolfsden.cz>
> Date:   Tue Jan 14 23:12:17 2025 +0100
> 
>     M
> 
> /tmp/xx $ git commit-tree -p HEAD~1 -p HEAD -m M HEAD^{tree}
> ec74e368519b667d8d280639db6642b28d37eb53
> /tmp/xx $ g show ec74e368519b667d8d280639db6642b28d37eb53
> commit ec74e368519b667d8d280639db6642b28d37eb53
> Merge: 17451b8 40ef875
> Author: Tomas Volf <~@wolfsden.cz>
> Date:   Tue Jan 14 23:12:32 2025 +0100
> 
>     M
> --8<---------------cut here---------------end--------------->8---
> 
> Notice that I have created two commits, and they have the same parents,
> just in swapped order.

Here, Tomas is presumably reacting to Condition 2b in my procedure for

authenticating merge commits, which I will quote here again:

> For commits that have multiple parents - ie. merge commits - we weaken
> the invariant as follows:
>
> 1. If all parents have the primary introduction as their most recent
>    ancestor, then the invariant holds as usual.
>    
> 2. If one or more parents has the primary introduction as its most
>    recent ancestor (call these the 'primary parents'), and the rest have
>    any of the additional introductions, then the merge commit is
>    authenticated if and only if:
>    a) it's signed by a key authorized in all of the primary parents, AND
>    b) the /first parent/ [^] of the merge commit is a primary parent.
>    
> 3. If all parents have the same additional introduction as their most
>    recent ancestor, then the invariant holds as usual.
>
> 4. If none of the parents have the primary introduction as their most
>    recent ancestor, nor do they have the same additional introduction,
>    then the merge commit cannot be authenticated.

Now, it turns out that the parent order in a merge commit isn't actually
the relevant detail here. The parent order is a /UI detail/: it's a
convention that helps indicate in which direction a branch was merged
(and possibly other things), so that `git log` can show this to us, but
it doesn't actually affect the internal representation of the commit
graph.

The relevant detail, which Tomas's observation should remind us of, is
that a Git commit graph doesn't include any information about 'merge
order', ie. 'which branch was merged into which'. In fact, it doesn't
include any information about /branches/ at all - those are just refs
that can be made to point to whatever commit you want; they are not part
of the commit graph.

Once we realise this, we can see that trying to control which branch can
be merged into which doesn't make sense.

This led me to think of an attack that's possible with my design - if I
want to screw with anyone `guix pull`ing from my fork, I can merge
upstream into my fork branch, add a bunch of malicious commits, and then
make the upstream branch ref point to the latest such commit. Now anyone
pulling from my fork will receive the malicious commits as part of
upstream's history - since no commit hashes needed to change, the pull
is a regular fast-forward one, with no indication that anything is
amiss. Authentication will succeed since the malicious merge commit has
our fork as its (first) parent, and that parent has the primary
introduction as its most recent ancestor.

The takeaway here is that anyone authorized via the primary introduction
can fake new upstream commits.
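
The attack can be modelled in a few lines of Python. This is only a
sketch: the commit names, the `graph` dictionary (commit to list of
parents) and the helpers `ancestors` and `is_fast_forward` are
hypothetical, and real commits would be identified by hashes. It shows
that moving the 'upstream' ref onto the malicious commits leaves the graph
untouched and still looks like an ordinary fast-forward to anyone pulling.

```python
def ancestors(graph, commit):
    """Return every commit reachable from `commit`, including itself."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(graph[current])
    return seen

def is_fast_forward(graph, old_tip, new_tip):
    """A ref update is a fast-forward if the old tip is an ancestor of the new one."""
    return old_tip in ancestors(graph, new_tip)

# U1 <- U2 is upstream, F1 is the fork, M merges U2 into the fork,
# X1 and X2 are the malicious commits added on top of the merge.
graph = {
    "U1": [], "U2": ["U1"],
    "F1": ["U1"],
    "M": ["F1", "U2"],
    "X1": ["M"], "X2": ["X1"],
}
refs = {"upstream": "U2", "fork": "X2"}

old_tip = refs["upstream"]
refs["upstream"] = "X2"  # only the ref moves; no commit and no hash changes

print(is_fast_forward(graph, old_tip, refs["upstream"]))  # True
```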

So why bother with additional introductions at all, then? Because as far
as I can tell, they are still the only solution mentioned so far that
satisfies the requirements I mentioned earlier:

> not bumping the channel introduction (to avoid the increased attack
> surface from having to keep obtaining the updated one, as I discussed
> earlier), keeping fork history intact (to avoid force pulls), keeping
> upstream history intact, and being able to authenticate both upstream
> and fork commits

...and yes, you do have to trust the fork maintainer to not deliberately

mess those things up. But that's nothing new - even in the existing

design, we have to trust everyone who can make trusted commits not to

mess things up on purpose.

So what does all of this mean for the statement of my design? Well,
it means that we need to stop thinking in terms of 'which branch can be
merged into which?' and start thinking in terms of 'which merge commits
can be authenticated?'. And the answer to that question, with my design,
would be:

1. Any merge commit signed with a key in the intersection of its
   parents' .guix-authorizations. (Standard authorization invariant.)

2. Any merge commit that doesn't meet the above condition, but has a
   parent whose most recent ancestor is the primary introduction, and is
   signed by a key in the .guix-authorizations of that parent. (My
   weakened authorization invariant.)

Finally, let me restate the conditions for authenticating merge commits,

taking this into account:

For commits that have multiple parents - ie. merge commits - we weaken
the invariant as follows:

1. If all parents have the primary introduction as their most recent
   ancestor, then the invariant holds as usual.
   
2. If one or more parents has the primary introduction as its most
   recent ancestor (call these the 'primary parents'), and the rest have
   any of the additional introductions, then the merge commit is
   authenticated if and only if it's signed by a key authorized in all
   of the primary parents.
   
3. If all parents have the same additional introduction as their most
   recent ancestor, then the invariant holds as usual.
   
4. If none of the parents have the primary introduction as their most
   recent ancestor, nor do they have the same additional introduction,
   then the merge commit cannot be authenticated.

I merged 2a. into 2., and removed 2b.
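
As a sanity check, the four conditions can be written down as a small
Python predicate. This is a sketch of the proposal as stated above, not of
anything that exists in `guix git authenticate`; the names `Parent`,
`PRIMARY` and `merge_is_authenticated` are hypothetical, and each parent
is assumed to arrive already labelled with the introduction it was
authenticated from. 'The invariant holds as usual' is taken to mean that
the signer must be authorized in every parent.

```python
from dataclasses import dataclass

PRIMARY = "primary"  # label for the primary introduction (hypothetical)

@dataclass(frozen=True)
class Parent:
    origin: str            # PRIMARY, or the name of an additional introduction
    authorized: frozenset  # keys listed in this parent's authorizations file

def merge_is_authenticated(signer, parents):
    """Apply conditions 1-4 to a merge commit signed by `signer`."""
    origins = {p.origin for p in parents}
    in_all = all(signer in p.authorized for p in parents)
    if origins == {PRIMARY}:   # 1. all parents primary: usual invariant
        return in_all
    if PRIMARY in origins:     # 2. mixed: only the primary parents count
        return all(signer in p.authorized
                   for p in parents if p.origin == PRIMARY)
    if len(origins) == 1:      # 3. same additional introduction: usual invariant
        return in_all
    return False               # 4. anything else cannot be authenticated

# The fork is pulled with its own introduction as primary; upstream Guix
# is an additional introduction.
fork_tip = Parent(PRIMARY, frozenset({"FORK-KEY"}))
upstream_tip = Parent("upstream-guix", frozenset({"COMMITTER-KEY"}))

print(merge_is_authenticated("FORK-KEY", [fork_tip, upstream_tip]))       # True
print(merge_is_authenticated("COMMITTER-KEY", [fork_tip, upstream_tip]))  # False
```

The first call is the case this whole thread is about: a fork maintainer
merging upstream with a key that upstream has never authorized.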

Now let me try to respond to you:

> Yeah, I think this scheme will still end up in [4]. As pointed out in
> [8], "primary" is just a convention that we can't rely on.

Not really. As I discussed, [8] points out that /merge order/ is the

convention that we can't rely on. Introductions can be deliberately

specified as primary or additional, whether via command-line flags or

separate sections in .git/config.

> So let's just talk about the idea of widening one channel introduction
> to any number of channel introductions – we can always store a mapping
> of HEAD → first authenticated commit and then assert that this set is
> a subset of what we declare as introductions.  (This mapping will also
> make authentication as efficient as it currently is, since we don't
> need to reauthenticate everything all the time.)

I'm not sure what you mean. What do you mean by "mapping of HEAD → first

authenticated commit"? Does this perhaps mean 'all commits between the

latest one and the first authenticated commit'?

What does "assert that this set is a subset of what we declare as

introductions" mean?

> Is this good enough? No: an attacker could easily add their own
> introduction and call it a day. In fact, this scheme is even worse
> than what was exploited in [4], because they never need commit access
> to the Guix repo to do so. Ahh, but wait! `guix pull` on the user's
> side uses their clean set of channels for authentication. Those only
> have upstream Guix… unless you actually pull your own fork or manage an
> attack as outlined below (in which case you do need commit access for
> some amount of time).

I should point out - my design does not require us to distribute any
introductions besides Guix's existing one, nor will it provide any
mechanism to automatically 'install' someone else's introduction. An
introduction is a tuple of (introductory commit, key that signs it) that
you specify as arguments to `guix git authenticate`. An attacker would
have to convince the entire Guix community to specify their (the
attacker's) own introduction on the command line (or directly add it
into .git/config). And given that there is no reason to ever do so
unless you're using someone's fork... that's a hard sell.

Perhaps I should have mentioned this when you suggested the attack below
in the first place.

>> > I think this might still hide a serious flaw.  With the way
>> > *upstream* authentication works.  Let's flip the example in [6]
>> > around a little bit and construct the following:
>> >
>> > -A---B---C---D
>> >   \       \
>> >    \       \-E---F---?
>> >     \               /
>> >      \----G--H--I*-/
>> >
>> > Both A and I* are introductory commits on their various branches.
>> > In ?, any committer who has valid keys in both F and I* can merge
>> > a branch with unsigned commits, effectively voiding the invariant
>> > of BCEF, e.g. by undoing any changes that happened there.  Of
>> > course, they can do so with signed commits as well, given that they
>> > have commit access to the main repository, but the point still
>> > holds that they may introduce unsigned commits to any fork where
>> > their key is valid in.
>> 
>> So, my design enables an attacker who can make authorized signed
>> commits to also introduce changes made in unsigned commits. Hmm.
>> 
>> I don't think this compromises our current security guarantees,
>> though?
> I mean, the promise we do make is that all commits starting from a
> certain commit are signed.  So IMHO, this effectively breaks that :)

Again, you need to deliberately use the attacker's introduction for this
to work. Unless you're pulling from their fork (in which case you
already trust them), there's no reason for them to ask you to do so.

>> If the attacker can already make trusted commits, then any attack
>> they can perform in the way you described can also be done directly
>> with signed commits onto F, as you pointed out. And the latter way
>> would be far simpler for them.
> Simpler, yes, but less stealthy. Most contributors don't concern
> themselves with the specifics of any particular branch, and you may
> even be able to dress up your evil branch as a good branch until the
> point where you finally merge it.

See above. We will never need to specify more than one introduction for

the main Guix repo, so this doesn't come up. We're not trying to enable

pull-request-style workflows within Guix; we're just trying to permit

authenticated forks.

>> Also, the branch they merged into would not contain any unsigned
>> commits; the commit '?' is still signed with a key authorized for
>> F's branch. So at most, we can say that the attacker can introduce
>> /changes made in/ unsigned commits, not 'introduce unsigned commits'.
> They can make an arbitrary number of unsigned commits before needing to
> sign off one commit that will be merged.  If they follow the style of
> merging master into their branch and then their branch into master,
> said commit can even be empty, though that would no longer be stealthy.
> Now if they were to I don't know, bump 9000 Rust packages or something
> like that, they have a lot of space to exploit the as-of-yet in this
> manner unexploited, but still weak SHA-1 ha

This message was truncated. Download the full message here.

45mg wrote 40 hours ago

Recipients:

Message-ID:87o706rizf.fsf@gmail.com

Liliana Marie Prikler <liliana.prikler@gmail.com> writes:

[...]

> You can roll your own service definitions, but it does become harder
> when you want to keep all changes to that service from master as well.
> But `(use-modules (my-channel services nftables))` should pull that
> nftables code :)

[...]

>> Then there is anything modifying any of the guix commands.  #74832 is
>> over month old, and as far as I know, I am not able to fix guix-copy
>> from a channel.  #72928 took over a month to merge, and again, not
>> sure how to patch guix-describe from a channel.
> Have you considered extensions?

[...]

>> (Yes, I am aware I can just copy&paste the service code into my
>> channel.  But at that point I am again just "replicating Guix", just
>> by more manual and error-prone means.  And even for packages,
>> adjusting system configuration to use package from my channel,
>> getting it merged and then adjusting back to upstream is annoying
>> chore.)
> You could code your channel in a way that it serves upstream stuff
> either silently or with a deprecation warning if a particular package
> is requested.  Not a channel, but [1] illustrates my point.

You could probably get all these ideas to work. But try to put yourself
in the shoes of someone who's just sent any of these patches. After
putting in the hard work to fix an issue by modifying the upstream code,
they now need to fix it AGAIN in a different way (via their channel).
Imagine trying to reimplement something as complex as the bootloader
subsystem rewrite [2] in your channel.

If you're going to have to incorporate every contribution into your
channel, you're heavily incentivized to only ever work on your channel,
and never bother sending any patches to upstream.

> Cheers
> [1] https://git.ist.tugraz.at/clingabomino/clingabomino/-/blob/0.2.0/pkg/guix.scm?ref_type=tags#L30

45mg wrote 40 hours ago

Recipients:(address . 75552@debbugs.gnu.org)

Message-ID:87ikqerimt.fsf@gmail.com

Forwarding Attila's message here, because it wasn't sent to bug-guix, so

it may not have reached some of you and won't show up in the issue

tracker.

As far as I can tell, this is exactly what the 'rebase' approach

mentioned upthread would look like in practice. As mentioned, it has the

problem of having to bump the introduction every time, and I've written

about the security aspects of this (beginning of [1]). Also, as Attila

notes, it's burdensome.

[1] https://lists.gnu.org/archive/html/bug-guix/2025-01/msg00135.html

-------------------- Start of forwarded message --------------------

To: 45mg <45mg.writes@gmail.com>

Cc: Felix Lechner <felix.lechner@lease-up.com>, Tomas Volf <~@wolfsden.cz>, help-guix@gnu.org, guix-devel@gnu.org

i haven't read the entire thread [sorry], but with that in mind here's how i'm solving this:

i have various branches where i keep my not-yet-merged work. i also have a script that creates/overwrites a branch (called 'attila', starting at the tag 'attila-baseline') and cherry picks everything into it. i sometimes `git tag -f` the 'attila-baseline' tag to pick a new baseline.

then i update my intro commit hash wherever i want to pull my rebased/cherry-picked changes (this is a several machines setup, and yes, it's burdensome).

when a cherry pick fails, then i cancel the script, rebase the problematic branch on 'attila-baseline', and restart the script pasted below.

• attila lendvai

• PGP: 963F 5D5F 45C7 DFCD 0A39

“Is there an idea more radical in the history of the human race than turning your children over to total strangers whom you know nothing about, and having those strangers work on your child's mind, out of your sight, for a period of twelve years? […] Back in Colonial days in America, if you proposed that kind of idea, they'd burn you at the stake, you mad person! It's a mad idea!”

— John Taylor Gatto (1935–2018), Teacher of the Year, both in New York City and State, multiple times

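#!/usr/bin/env bash

# Branches holding not-yet-merged work; each is cherry-picked onto 'attila'.
BRANCHES="kludges ui-warnings print-branch-name"
BRANCHES+=" shepherd-guix-side"

# Abort on the first failing command (e.g. a cherry-pick conflict).
set -e

initial_branch=$(git branch --show-current)

# Move the initial commit onto the current baseline.
git rebase attila-baseline attila-initial-commit

# Recreate the 'attila' branch from the baseline plus the initial commit.
git checkout attila
git reset --hard attila-baseline
git pull . attila-initial-commit

for branch in ${BRANCHES}; do
  echo "*** Processing branch ${branch}"
  #git rebase attila-baseline $branch
  git cherry-pick "attila-baseline..${branch}"
done

#git checkout $initial_branch

# Print the rebased initial commit (presumably the new intro commit hash).
git -c pager.log=false log --pretty=oneline attila-initial-commit~1..attila-initial-commit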

-------------------- End of forwarded message --------------------

Tomas Volf wrote 39 hours ago

Re: Non-committers can't keep authenticated forks updated

Recipients:(name . Ricardo Wurmus)(address . rekado@elephly.net)

Message-ID:87wmeurgjv.fsf@wolfsden.cz

Ricardo Wurmus <rekado@elephly.net> writes:

> Liliana Marie Prikler <liliana.prikler@gmail.com> writes:
>
>>> This has the slight issue that I can no longer easily answer a
>>> question "is this commit in my fork", since I cannot search by the
>>> commit hash. I admit it is not a question I need to answer often
>>> (last time was on 21st of October, CVE-2024-52867).
>> You could solve this by embedding an "upstream-commit:" trailer, but
>> that is an admittedly cursed transformation that no longer maps to a
>> single rebase, I admit.
>
> You can use the Change-Id tag for this.  Our local hooks create them and
> they stay intact after rebase.

Oh, right. That was not a thing yet when I started my tree, but these
days it would work (with the obvious detail that people use commit
hashes, not change-ids, in emails, so I would need to translate first).

But yes, it is still a large improvement over matching commit messages.
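
A minimal sketch of that translation step, assuming commits carry a
`Change-Id:` trailer in their message as described above. The function
names and the default branch name `my-fork` are made up for illustration;
only standard `git log` options are used.

```shell
#!/usr/bin/env bash
# Sketch only: map an upstream commit hash to its rebased counterpart in a
# fork via the Change-Id trailer. Function names are hypothetical.

# Print the Change-Id trailer value found in a commit message read from stdin.
change_id_from_message() {
  sed -n 's/^Change-Id:[[:space:]]*//p' | tail -n 1
}

# Print the Change-Id of the given commit (empty if it has none).
change_id_of_commit() {
  git log -1 --format=%B "$1" | change_id_from_message
}

# List commits on branch $2 (default: my-fork, an assumed name) whose
# message carries the same Change-Id as commit $1.
find_in_fork() {
  id=$(change_id_of_commit "$1")
  [ -n "$id" ] || { echo "no Change-Id in $1" >&2; return 1; }
  git log --format='%H %s' --fixed-strings --grep="Change-Id: ${id}" "${2:-my-fork}"
}
```

With a hash taken from an email, `find_in_fork <hash> my-fork` would list the
matching fork commits, provided the upstream commit is available locally.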

Tomas

There are only two hard things in Computer Science:

cache invalidation, naming things and off-by-one errors.

Liliana Marie Prikler wrote 39 hours ago

Recipients:

Message-ID:ef18ee3dc4813b20fe40c21ec0f03e856acdb9a0.camel@gmail.com

On Thursday, 16.01.2025 at 13:10 +0000, 45mg wrote:

> As the 'Authenticate Your Git Checkouts'
> blog post [9] pointed out, we wouldn't need `guix git authenticate`
> if we were willing to delegate our security to a trusted third party,
> like all the open source projects that sport those "fancy “✓
> verified” badges as found on GitLab and on GitHub" do. We shouldn't
> force anyone hosting a fork to do so as well.

I mean, you can host your own fork and use the fancy “✓ verified” badge
of your host as source of trust – it just won't be checked by `guix
pull', that's all.  If you do do that, I'd recommend using a file://
URI with a local checkout for your channel, so that you can verify that
little check mark on your own (then you only need to trust your own
file system).

> > I think you're making this more complicated than it needs to be.
> > checkout, authenticate, rebase*, merge* ought to have you covered.
> >
> > * you can authenticate after these if you're paranoid
> 
> The complexity is due to the requirements of not bumping the channel
> introduction (to avoid the increased attack surface from having to
> keep obtaining the updated one, as I discussed earlier), keeping fork
> history intact (to avoid force pulls), keeping upstream history
> intact, and being able to authenticate both upstream and fork
> commits. If you can think of a simpler method that meets these
> requirements, I'd love to hear it.

Guix committers are more than happy to use work trees and rebases,
which simplify this a lot – again, it's as simple as pull,
authenticate, rebase.

W.r.t. keeping history intact, we had the following exchange on IRC
yesterday.

<Rutherther>	lfam: that's interesting that there is really a merge
commit, for example if I remember right, the core updates merge few
months ago happened by directly appending the commits instead of a
merge commit
<lfam>	Yes, there are two ways to do it (rebase and merge) and it's a
matter of taste
<lfam>	Of course there is a correct choice, as with all questions of
taste ;)
<Rutherther>	I personally prefer a merge commit, since it has two
parents, you can track where the previous master pointed to
<lfam>	And I prefer a rebase. But ultimately it's up to whoever is in
front of the keyboard
<lilyp>	FWIW, a rebase is cleaner, but requires that only one person
signs off commits on any given branch (or else you're signing commits
that someone else signed before and have to update the trailer… not
ideal)
<lfam>	It doesn't matter who signs the commits as long as they are
authorized. That's the security model we have

So yeah, even for (branch-)local work at least some committers prefer
rebasing.

> > No, it wouldn't. You would rebase those changes on top of what you
> > already have on those respective branches.
> 
> It looks like at least one of us is misunderstanding rebasing. Could
> be me, so I'm consulting the relevant chapter from the Pro Git book
> [11] for a refresher.
> 
> Let's imagine that the first example given there represents our fork
> of Guix, where the 'experiment' branch marks the beginning of our
> fork (and its channel introduction) and the 'master' branch tracks
> upstream Guix.  After `git rebase master`, the commit that used to be
> C4 is gone, and now C4' takes its place. It may contain the same
> changes, but it's a different commit - so it (and any commits that
> it's the parent of) has a different hash. So the channel introduction
> has changed, and so has the entire history of the `experiment`
> branch. So we need to force-pull.

Yes, that's one variant – the one where you need to keep bumping your
channel introductions. The other direction would be to rebase Guix
changes on top of your local branch. This keeps your channel
introduction as-is.

> […]
> This led me to think of an attack that's possible with my design - if
> I want to screw with anyone `guix pull`ing from my fork, I can merge
> upstream into my fork branch, add a bunch of malicious commits, and
> then make the upstream branch ref point to the latest such commit.
> Now anyone pulling from my fork will receive the malicious commits as
> part of upstream's history - since no commit hashes needed to change,
> the pull is a regular fast-forward one, with no indication that
> anything is amiss. Authentication will succeed since the malicious
> merge commit has our fork as its (first) parent, and that parent has
> the primary introduction as its most recent ancestor.
> 
> The takeaway here is that anyone authorized via the primary
> introduction can fake new upstream commits.

Care to state how designating one introduction as "primary" adds to
security here?

> So why bother with additional introductions at all, then? Because as
> far as I can tell, they are still the only solution mentioned so far
> that satisfies the requirements I mentioned earlier:
> > not bumping the channel introduction (to avoid the increased attack
> > surface from having to keep obtaining the updated one, as I
> > discussed earlier), keeping fork history intact (to avoid force
> > pulls), keeping upstream history intact, and being able to
> > authenticate both upstream and fork commits
> ...and yes, you do have to trust the fork maintainer to not
> deliberately mess those things up. But that's nothing new - even in
> the existing design, we have to trust everyone who can make trusted
> commits not to mess things up on purpose.

You are trading one attack surface for another. Again, all is fine
while you only have to trust yourself, but weakening an invariant is
weakening an invariant (:

> So what does all of this mean for the statement of my design?
> Well, it means that we need to stop thinking in terms of 'which
> branch can be merged into which?' and more in terms of 'which merge
> commits can be authenticated?'. And the answer to that question, with
> my design, would be:
>
> 1. Any merge commit signed with a key in the intersection of its
>    parents' .guix_authorizations. (Standard authorization invariant.)
>
> 2. Any merge commit that doesn't meet the above conditions, but has a
>    parent whose most recent ancestor is the primary introduction, and
>    is signed by a key in the .guix_authorizations of that parent. (My
>    weakened authorization invariant.)

That's a pretty long way of saying "Any merge commit signed with a key
in one of its parents' .guix_authorizations." It is (by your design)
trivial to have the "primary introduction" under your control.

> Finally, let me restate the conditions for authenticating merge
> commits, taking this into account:
> 
> --8<---------------cut here---------------start------------->8---
> For commits that have multiple parents - ie. merge commits - we
> weaken the invariant as follows:
> 
> 1. If all parents have the primary introduction as their most recent
>    ancestor, then the invariant holds as usual.
>
> 2. If one or more parents has the primary introduction as its most
>    recent ancestor (call these the 'primary parents'), and the rest
>    have any of the additional introductions, then the merge commit is
>    authenticated if and only if it's signed by a key authorized in
>    all of the primary parents.
>
> 3. If all parents have the same additional introduction as their most
>    recent ancestor, then the invariant holds as usual.
>
> 4. If none of the parents have the primary introduction as their most
>    recent ancestor, nor do they have the same additional
>    introduction, then the merge commit cannot be authenticated.
> --8<---------------cut here---------------end--------------->8---
> 
> I merged 2a. into 2., and removed 2b.
> 
> Now let me try to respond to you:
> 
> > Yeah, I think this scheme will still end up in [4]. As pointed out
> > in [8], "primary" is just a convention that we can't rely on.
> 
> Not really. As I discussed, [8] points out that /merge order/ is the
> convention that we can't rely on. Introductions can be deliberately
> specified as primary or additional, whether via command-line flags or
> separate sections in .git/config.
> 
> > So let's just talk about the idea of widening one channel
> > introduction to any number of channel introductions – we can always
> > store a mapping of HEAD → first authenticated commit and then
> > assert that this set is a subset of what we declare as
> > introductions. (This mapping will also make authentication as
> > efficient as it currently is, since we don't need to reauthenticate
> > everything all the time.)
> 
> I'm not sure what you mean. What do you mean by "mapping of HEAD →
> first authenticated commit"? Does this perhaps mean 'all commits
> between the latest one and the first authenticated commit'?

Little refresher: Guix stores a list of already authenticated commits
so as to not redo this work all over again.  If we were to allow
multiple introductions, we would also need to find the first
authenticated commit among them to match against the channel
introductions.

> What does "assert that this set is a subset of what we declare as
> introductions" mean?

Let's say that you work on branches B, C, and D with "primary"
introductions I, J, and K. If you want to merge C into B, you need to
remember that B has I as its primary introduction, C has J, and so on.

> > Is this good enough? No: an attacker could easily add their own
> > introduction and call it a day. In fact, this scheme is even worse
> > than what was exploited in [4], because they never need commit
> > access to the Guix repo to do so. Ahh, but wait! `guix pull` on
> > the user's side uses their clean set of channels for
> > authentication. Those only have upstream Guix… unless you actually
> > pull your own fork or manage an attack as outlined below (in which
> > case you do need commit access for some amount of time).
> 
> I should point out - my design does not require us to distribute any
> introductions besides Guix's existing one, nor will it provide any
> mechanism to automatically 'install' someone else's introduction.

Yes it will, per `%default-guix-channel'.

> An introduction is a tuple of (introductory commit, key that signs
> it) that you specify as arguments to `guix git authenticate`. An
> attacker would have to convince the entire Guix community to specify
> their (the attacker's) own introduction on the command line (or
> directly add it into .git/config). And given that there is no reason
> to ever do so unless you're using someone's fork... that's a hard
> sell.

Well, another hard sell would be introducing a feature to `guix git
authenticate` that must not ever be used in Guix itself.  Now, since
you are already soft-forking Guix, you can obviously add this to your
own guix command, but do beware the dragons you're summoning with it.

Cheers

Tomas Volf wrote 39 hours ago

Recipients:(name . Liliana Marie Prikler)(address . liliana.prikler@gmail.com)

Message-ID:87sepirfez.fsf@wolfsden.cz

Liliana Marie Prikler <liliana.prikler@gmail.com> writes:

> [..]
>> Then there is anything modifying any of the guix commands. #74832 is
>> over a month old, and as far as I know, I am not able to fix guix-copy
>> from a channel. #72928 took over a month to merge, and again, not
>> sure how to patch guix-describe from a channel.
> Have you considered extensions?

Took me a while to figure out what you mean. Apparently there is a
$GUIX_EXTENSIONS_PATH environment variable that can be used to add
sub-commands (does not seem to be documented at all in the manual?).
Maybe I am looking in the wrong place.

If I understand it correctly, I could copy&paste the current code for
`guix describe' and add it as `guix describe*' with my modification.
However it feels a bit convoluted.

All of these things discussed in this thread are technically possible.
But I think that we all agree that the friction involved, compared to
just using my own fork with the patch applied, is much larger, at least
in my opinion.

Tomas

There are only two hard things in Computer Science:

cache invalidation, naming things and off-by-one errors.

45mg wrote 36 hours ago

Recipients:

Message-ID:87cygmmzuh.fsf@gmail.com

Hi Liliana,

Liliana Marie Prikler <liliana.prikler@gmail.com> writes:

> On Thursday, 16.01.2025 at 13:10 +0000, 45mg wrote:
>> As the 'Authenticate Your Git Checkouts'
>> blog post [9] pointed out, we wouldn't need `guix git authenticate`
>> if we were willing to delegate our security to a trusted third party,
>> like all the open source projects that sport those "fancy “✓
>> verified” badges as found on GitLab and on GitHub" do. We shouldn't
>> force anyone hosting a fork to do so as well.
> I mean, you can host your own fork and use the fancy “✓ verified” badge
> of your host as source of trust – it just won't be checked by `guix
> pull', that's all.  If you do do that, I'd recommend using a file://
> URI with a local checkout for your channel, so that you can verify that
> little check mark on your own (then you only need to trust your own
> file system).

Yeah, I know I can. My point is that people who use remote forks
shouldn't have to rely on a trusted third party. We've figured out a way
for upstream Guix not to have to, now let's try to extend that to forks.

>> > I think you're making this more complicated than it needs to be.
>> > checkout, authenticate, rebase*, merge* ought to have you covered.
>> >
>> > * you can authenticate after these if you're paranoid
>> 
>> The complexity is due to the requirements of not bumping the channel
>> introduction (to avoid the increased attack surface from having to
>> keep obtaining the updated one, as I discussed earlier), keeping fork
>> history intact (to avoid force pulls), keeping upstream history
>> intact, and being able to authenticate both upstream and fork
>> commits. If you can think of a simpler method that meets these
>> requirements, I'd love to hear it.
> Guix committers are more than happy to use work trees and rebases,
> which simplify this a lot – again, it's as simple as pull,
> authenticate, rebase.

No doubt this works for those with commit access. The title of this
issue is 'Non-committers can't keep authenticated forks updated'.

Since most future committers will take years to attain that status, and
many (most?) Guix contributors can't commit (heh) to being committers, I
think it would be a good thing for them to be able to make use of our
security mechanisms.

(Actually, if you mean the variant of rebasing you describe below, then
I see what you mean, never mind)

> W.r.t. keeping history intact, we had the following exchange on IRC
> yesterday.
>
> <Rutherther>	lfam: that's interesting that there is really a merge
> commit, for example if I remember right, the core updates merge few
> months ago happened by directly appending the commits instead of a
> merge commit
> <lfam>	Yes, there are two ways to do it (rebase and merge) and it's a
> matter of taste
> <lfam>	Of course there is a correct choice, as with all questions of
> taste ;)
> <Rutherther>	I personally prefer a merge commit, since it has two
> parents, you can track where the previous master pointed to
> <lfam>	And I prefer a rebase. But ultimately it's up to whoever is in
> front of the keyboard
> <lilyp>	FWIW, a rebase is cleaner, but requires that only one person
> signs off commits on any given branch (or else you're signing commits
> that someone else signed before and have to update the trailer… not
> ideal)
> <lfam>	It doesn't matter who signs the commits as long as they are
> authorized. That's the security model we have
>
> So yeah, even for (branch-)local work at least some committers prefer
> rebasing.

That seems to be a discussion about a merge commit in upstream Guix, not
about the kind of merge commits that I'm trying to allow.

Again, not disputing that things work fine for people with commit
access. Perhaps that is part of why this issue hasn't been addressed
before :P

>> > No, it wouldn't. You would rebase those changes on top of what you
>> > already have on those respective branches.
>> 
>> It looks like at least one of us is misunderstanding rebasing. Could
>> be me, so I'm consulting the relevant chapter from the Pro Git book
>> [11] for a refresher.
>> 
>> Let's imagine that the first example given there represents our fork
>> of Guix, where the 'experiment' branch marks the beginning of our
>> fork (and its channel introduction) and the 'master' branch tracks
>> upstream Guix.  After `git rebase master`, the commit that used to be
>> C4 is gone, and now C4' takes its place. It may contain the same
>> changes, but it's a different commit - so it (and any commits that
>> it's the parent of) has a different hash. So the channel introduction
>> has changed, and so has the entire history of the `experiment`
>> branch. So we need to force-pull.
> Yes, that's one variant – the one where you need to keep bumping your
> channel introductions.  The other direction would be to rebase Guix
> changes on top of your local branch.  This keeps your channel
> introduction as-is.

Ah, I see. Actually, I think that might work... if I create an
'upstream-backup' branch before rebasing, and reset the 'upstream'
branch to that branch after, I can keep the full history of upstream,
and authenticate it separately. And thanks to Ricardo's suggestion [12]
to compare by Change-ID, it should even be possible to find corresponding
commits between my fork and upstream. Looks like this solution meets all
four requirements that I stated earlier, and it wouldn't be TOO annoying
or complicated to script either.
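
A rough, untested sketch of how such a script might look. The branch names
`my-fork`, `upstream` and `upstream-backup`, the `DRY_RUN` switch and the
function names are assumptions made for illustration; it also assumes that
`upstream` already points at the freshly fetched upstream tip and that the
signing key is authorized in the fork.

```shell
#!/usr/bin/env bash
# Sketch only: replay new upstream commits on top of a fork branch while
# keeping the untouched upstream history around. Branch names are assumed.

FORK=${FORK:-my-fork}              # fork branch carrying the channel introduction
UPSTREAM=${UPSTREAM:-upstream}     # local branch at the fetched upstream tip
BACKUP=${BACKUP:-upstream-backup}  # remembers the unrewritten upstream tip

# With DRY_RUN set, print each command instead of executing it.
run() {
  if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

sync_fork() {
  # 1. Remember where the real upstream history ends.
  run git branch -f "$BACKUP" "$UPSTREAM" || return 1
  # 2. Replay upstream commits missing from the fork on top of it, signing
  #    each rebased commit with our own key (-S). Existing fork commits
  #    keep their hashes, so the introduction stays the same.
  run git rebase -S "$FORK" "$UPSTREAM" || return 1
  # 3. Fast-forward the fork branch to the rebased tip.
  run git checkout "$FORK" || return 1
  run git merge --ff-only "$UPSTREAM" || return 1
  # 4. Point the upstream branch back at the original history.
  run git branch -f "$UPSTREAM" "$BACKUP" || return 1
}
```

Running `DRY_RUN=1 sync_fork` only prints the five git commands. After a real
run, each branch would still have to be authenticated separately with
`guix git authenticate` against its own introduction.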

One limitation I can think of is that you can't really verify whether a
commit ostensibly rebased from master is actually the same commit
(imagine 100 commits in a rebase, are you going to check the diff for
each? Versus with a merge you only have to check that the merge commit
looks sane). Then again, this is really only a problem if you're using
someone ELSE's fork and need to ensure they've not gone mad or evil,
which is not my personal use-case, and perhaps not that common.

You know what, I think I'll give it a try. If there aren't any unforeseen
complications, then this solution may be good enough for me, and perhaps
good enough to add to the documentation as a recommendation so that
others don't have to puzzle this out themselves. Thanks for giving me
the idea!

(At any rate, feel free to reply to my points here in the meantime; if
it turns out that the rebase workflow is insufficient for some reason
then I can pick up where we left off)

>> […]
>> This led me to think of an attack that's possible with my design - if
>> I want to screw with anyone `guix pull`ing from my fork, I can merge
>> upstream into my fork branch, add a bunch of malicious commits, and
>> then make the upstream branch ref point to the latest such commit.
>> Now anyone pulling from my fork will receive the malicious commits as
>> part of upstream's history - since no commit hashes needed to change,
>> the pull is a regular fast-forward one, with no indication that
>> anything is amiss. Authentication will succeed since the malicious
>> merge commit has our fork as its (first) parent, and that parent has
>> the primary introduction as its most recent ancestor.
>> 
>> The takeaway here is that anyone authorized via the primary
>> introduction can fake new upstream commits.
> Care to state how designating one introduction as "primary" adds to
> security here?  

The problem I'm trying to solve is that you can't merge upstream into
your fork unless you sign the merge commit with a key authorized in
upstream, because of the intersections-of-parent-authorizations design.
Tomas's solution was to do away with the 'intersections' requirement,
and allow a key that's authorized for any parent to sign the merge
commit; this is vulnerable to attacks like [4]. My solution falls
somewhere in between - we keep the 'intersections' requirement, /except/
when one of the parent commits is a descendant of an introduction
designated as the 'primary introduction'.

If you drop the 'primary introduction' designation (make all of them
'primary'), then you're basically dropping the intersections
requirement. IOW, we've returned to the situation in Tomas's
solution, where [4] is possible - anyone with even a single signed
commit in Guix can now create a merge commit between Guix and your fork,
even if their key was later removed from Guix.

>> So why bother with additional introductions at all, then? Because as
>> far as I can tell, they are still the only solution mentioned so far
>> that satisfies the requirements I mentioned earlier:
>> > not bumping the channel introduction (to avoid the increased attack
>> > surface from having to keep obtaining the updated one, as I
>> > discussed earlier), keeping fork history intact (to avoid force
>> > pulls), keeping upstream history intact, and being able to
>> > authenticate both upstream and fork commits
>> ...and yes, you do have to trust the fork maintainer to not
>> deliberately mess those things up. But that's nothing new - even in
>> the existing design, we have to trust everyone who can make trusted
>> commits not to mess things up on purpose.
> You are trading one attack surface for another.  Again, all is fine
> while you only have to trust yourself, but weakening an invariant is
> weakening an invariant (:

Which attack surface am I introducing or worsening? The 'authorized
committer can mess things up' attack surface was already /there/. The
attack I described does not make it worse AFAICT - its potential impact
is to compromise the ability to easily authenticate all upstream and
fork commits in one go, but that ability is not something we previously
had in the first place.

>> So what does all of this mean for the statement of my design?
>> Well, it means that we need to stop thinking in terms of 'which
>> branch can be merged into which?' and more in terms of 'which merge
>> commits can be authenticated?'. And the answer to that question, with
>> my design, would be:
>>
>> 1. Any merge commit signed with a key in the intersection of its
>>    parents' .guix_authorizations. (Standard authorization invariant.)
>>
>> 2. Any merge commit that doesn't meet the above conditions, but has a
>>    parent whose most recent ancestor is the primary introduction, and
>>    is signed by a key in the .guix_authorizations of that parent. (My
>>    weakened authorization invariant.)
> That's a pretty long way of saying "Any merge commit signed with a key
> in one of its parents' .guix_authorizations."

Not quite. Merge commits between upstream Guix and your fork, whose
signing key is in the .guix_authorizations of its parent in upstream but
not in the .guix_authorizations of its parent in your fork, would not be
authenticated. (This is the kind of merge commit that would be used for
[4].)
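
To make the difference between the three rules concrete, here is a toy
model of a two-parent merge of upstream into a fork. It is only a sketch:
the policy names, the key names and the flat key lists are invented for
illustration and say nothing about how `guix git authenticate` is
implemented.

```shell
#!/usr/bin/env bash
# Sketch only: compare three merge-commit policies for a two-parent merge
# of upstream into a fork. All key and policy names are hypothetical.

# has_key KEY "K1 K2 ...": succeed if KEY is in the space-separated list.
has_key() {
  case " $2 " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# merge_ok POLICY SIGNER FORK_KEYS UPSTREAM_KEYS
#   FORK_KEYS:     keys authorized by the fork-side parent, which descends
#                  from the primary introduction
#   UPSTREAM_KEYS: keys authorized by the upstream-side parent
merge_ok() {
  case "$1" in
    # Current rule: the signer must be authorized by every parent.
    intersection) has_key "$2" "$3" && has_key "$2" "$4" ;;
    # Intersection requirement dropped: any parent will do.
    any-parent) has_key "$2" "$3" || has_key "$2" "$4" ;;
    # Proposed rule: the primary-side parent must authorize the signer
    # (this also covers every merge the intersection rule accepts).
    primary) has_key "$2" "$3" ;;
    *) echo "unknown policy: $1" >&2; return 2 ;;
  esac
}

fork_keys="fork-owner"
upstream_keys="committer-a committer-b"

for signer in fork-owner committer-a; do
  for policy in intersection any-parent primary; do
    if merge_ok "$policy" "$signer" "$fork_keys" "$upstream_keys"; then
      echo "$signer / $policy: accepted"
    else
      echo "$signer / $policy: rejected"
    fi
  done
done
```

In this model a merge signed by `committer-a` passes only under the
`any-parent` policy, while a merge signed by `fork-owner` passes under
`any-parent` and `primary` but not under `intersection`.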

> It is (by your design) trivial to have the "primary introduction"
> under your control.

Yes, as long as you're using your own fork this will be the case.

>> Finally, let me restate the conditions for authenticating merge
>> commits, taking this into account:
>> 
>> --8<---------------cut here---------------start------------->8---
>> For commits that have multiple parents - ie. merge commits - we
>> weaken the invariant as follows:
>> 
>> 1. If all parents have the primary introduction as their most recent
>>    ancestor, then the invariant holds as usual.
>>
>> 2. If one or more parents has the primary introduction as its most
>>    recent ancestor (call these the 'primary parents'), and the rest
>>    have any of the additional introductions, then the merge commit is
>>    authenticated if and only if it's signed by a key authorized in
>>    all of the primary parents.
>>
>> 3. If all parents have the same additional introduction as their most
>>    recent ancestor, then the invariant holds as usual.
>>
>> 4. If none of the parents have the primary introduction as their most
>>    recent ancestor, nor do they have the same additional
>>    introduction, then the merge commit cannot be authenticated.
>> --8<---------------cut here---------------end--------------->8---
>> 
>> I merged 2a. into 2., and removed 2b.
>> 
>> Now let me try to respond to you:
>> 
>> > Yeah, I think this scheme will still end up in [4]. As pointed out
>> > in [8], "primary" is just a convention that we can't rely on.
>> 
>> Not really. As I discussed, [8] points out that /merge order/ is the
>> convention that we can't rely on. Introductions can be deliberately
>> specified as primary or additional, whether via command-line flags or
>> separate sections in .git/config.
>> 
>> > So let's just talk about the idea of widening one channel
>> > introduction to any number of channel introductions – we can always
>> > store a mapping of HEAD → first authenticated commit and then
>> > assert that this set is a subset of what we declare as
>> > introductions. (This mapping will also make authentication as
>> > efficient as it currently is, since we don't need to reauthenticate
>> > everything all the time.)
>> 
>> I'm not sure what you mean. What do you mean by "mapping of HEAD →
>> first authenticated commit"? Does this perhaps mean 'all commits
>> between the latest one and the first authenticated commit'?
> Little refresher: Guix stores a list of already authenticated commits
> so as to not redo this work all over again.  If we were to allow
> multiple introductions, we would also need to find the first
> authenticated commit among them to match against the channel
> introductions.

Ok, so I guess you meant what I thought you did.

We can cache the introduction of each commit. In fact, that could
actually help block the attack I described in my previous mail [13], as
in that attack the introduction for all the upstream commits suddenly
changes to the primary introduction; if we manage to detect this, the
conflict with the cached introduction can tell us that something is off.
It's not a real defense as it wouldn't work on a fresh clone, but it's
something, at least.
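
A minimal sketch of that check, assuming a purely hypothetical cache file
with one "COMMIT INTRODUCTION" pair per line. This is not the format of
Guix's existing cache of authenticated commits; the function name and the
labels used below are invented.

```shell
#!/usr/bin/env bash
# Sketch only: remember which introduction each commit was first
# authenticated against, and flag any later change. The cache format
# (one "COMMIT INTRODUCTION" pair per line) is invented for illustration.

# check_intro CACHE_FILE COMMIT INTRODUCTION
check_intro() {
  [ -f "$1" ] || : > "$1"   # create an empty cache on first use
  cached=$(awk -v c="$2" '$1 == c { print $2; exit }' "$1")
  if [ -z "$cached" ]; then
    echo "$2 $3" >> "$1"    # first sighting: record it
  elif [ "$cached" != "$3" ]; then
    echo "warning: $2 moved from introduction $cached to $3" >&2
    return 1
  fi
}
```

Calling `check_intro "$cache" abc123 guix-intro` twice succeeds both times; a
later `check_intro "$cache" abc123 fork-intro` fails with a warning, which is
the conflict described above. A fresh clone starts with an empty cache, so
nothing would be detected there.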

>> What does "assert that this set is a subset of what we declare as
>> introductions" mean?
> Let's say that you work on branches B, C, and D with "primary"
> introductions I, J, and K. If you want to merge C into B, you need to
> remember that B has I as its primary introduction, C has J, and so on.

Ah, got it. Yes, that's correct. (Except that there's only one primary
introduction, at least as I intended it.)

>> > Is this good enough? No: an attacker could easily add their own
>> > introduction and call it a day. In fact, this scheme is even worse
>> > than what was exploited in [4], because they never need commit
>> > access to the Guix repo to do so. Ahh, but wait! `guix pull` on
>> > the user's side uses their clean set of channels for
>> > authentication. Those only have upstream Guix… unless you actually
>> > pull your own fork or manage an attack as outlined below (in which
>> > case you do need commit access for some amount of time).
>> 
>> I should point out - my design does not require us to distribute any
>> introductions besides Guix's existing one, nor will it provide any
>> mechanism to automatically 'install' someone else's introduction.
> Yes it will, per `%default-guix-channel'.

Ok, technically true. And someone with the ability to make trusted
commits upstream could modify that so that a fresh `guix pull` skips
whatever malicious stuff they've done. But my solution doesn't make this
situation any worse AFAICT.

>> An introduction is a tuple of (introductory commit, key that signs
>> it) that you specify as arguments to `guix git authenticate`. An
>> attacker would have to convince the entire Guix community to specify
>> their (the attacker's) own introduction on the command line (or
>> directly add it into .git/config). And given that there is no reason
>> to ever do so unless you're using someone's fork... that's a hard
>> sell.
> Well, another hard sell would be introducing a feature to `guix git
> authenticate` that must not ever be used in Guix itself.  Now, since
> you are already soft-forking Guix, you can obviously add this to your
> own guix command, but do beware the dragons you're summoning with it.

By that logic, we would never have gotten the ability to specify
multiple channels, since that isn't used in upstream Guix and doesn't
need to be.

> Cheers

[12] https://lists.gnu.org/archive/html/bug-guix/2025-01/msg00130.html

[13] https://lists.gnu.org/archive/html/bug-guix/2025-01/msg00135.html

Liliana Marie Prikler wrote 33 hours ago

Recipients:(name . Tomas Volf)(address . ~@wolfsden.cz)

Message-ID:75a9d78bcdab5a032814607a0e5ca58d5fbcaa2b.camel@gmail.com

On Thursday, 16.01.2025 at 15:39 +0100, Tomas Volf wrote:

> Liliana Marie Prikler <liliana.prikler@gmail.com> writes:
>
> > [..]
> >
> > > Then there is anything modifying any of the guix commands.
> > > #74832 is over a month old, and as far as I know, I am not able to
> > > fix guix-copy from a channel. #72928 took over a month to merge,
> > > and again, not sure how to patch guix-describe from a channel.
> > Have you considered extensions?
>
> Took me a while to figure out what you mean. Apparently there is a
> $GUIX_EXTENSIONS_PATH environment variable that can be used to add
> sub-commands (does not seem to be documented at all in the manual?).
> Maybe I am looking in the wrong place.

Good point, we do lack documentation for that. It was mentioned in a
blog post IIRC, but newcomers won't necessarily find that.

> If I understand it correctly, I could copy&paste the current code for
> `guix describe' and add it as `guix describe*' with my modification.
> However it feels a bit convoluted.

In practice, you import the module and then add your changes on top
with the minimum amount of work possible. E.g. for `guix describe*`,
you could first run `guix describe` and then print your own output
afterwards.

> All of these things discussed in this thread are technically
> possible. But I think that we all agree that the friction involved,
> compared to just using my own fork with the patch applied, is much
> larger, at least in my opinion.

Yes, we can agree that this is your opinion.  We can even agree that
there is more friction, but I'm not sure whether we agree on the value
of "much".  But honestly speaking, the friction with contributing to
upstream is much more a social one (too few people to review) than a
technical one, and soft forks are a band aid that will likely burn you
out even sooner the more you commit to them.

Cheers

Saturanya Rahjane de Lasca wrote 32 hours ago

Recipients:

Message-ID:3938fe5cb9ce8c6c26bd24937e32bedaae4f301e.camel@dismail.de

On Thursday, 16.01.2025 at 17:29 +0000, 45mg wrote:

> Yeah, I know I can. My point is that people who use remote forks
> shouldn't have to rely on a trusted third party. We've figured out a
> way for upstream Guix not to have to, now let's try to extend that to
> forks.

Well, the same way already works for channels and hard forks. It
really is soft forks that experience this issue – and even then there
are ways around it, as discussed.

> Since most future committers will take years to attain that status,
> and many (most?) Guix contributors can't commit (heh) to being
> committers, I think it would be a good thing for them to be able to
> make use of our security mechanisms.

They already can, see above.

I don't think asking would-be committers to soft-fork Guix is a
worthwhile idea even with the proposed change. Authoring your own
channel and contributing bits back to Guix suffices – and it really is
the things you contribute to Guix proper rather than your channels and
soft forks that make the cut.

> > W.r.t. keeping history intact, we had the following exchange on IRC
> > yesterday.
> > 
> > […]
> > So yeah, even for (branch-)local work at least some committers
> > prefer rebasing.
> 
> That seems to be a discussion about a merge commit in upstream Guix,
> not about the kind of merge commits that I'm trying to allow.

It is. I just wanted to give some context.

> Again, not disputing that things work fine for people with commit
> access. Perhaps that is part of why this issue hasn't been addressed
> before :P

You may call us privileged – and yes, we are – but that doesn't change
the fact that weakening security weakens security.

> > > Let's imagine that the first example given there represents our
> > > fork of Guix, where the 'experiment' branch marks the beginning
> > > of our fork (and its channel introduction) and the 'master'
> > > branch tracks upstream Guix.  After `git rebase master`, the
> > > commit that used to be C4 is gone, and now C4' takes its place.
> > > It may contain the same changes, but it's a different commit - so
> > > it (and any commits that it's the parent of) has a different
> > > hash. So the channel introduction has changed, and so has the
> > > entire history of the 'experiment' branch. So we need to
> > > force-pull.
> > Yes, that's one variant – the one where you need to keep bumping
> > your channel introductions.  The other direction would be to rebase
> > Guix changes on top of your local branch.  This keeps your channel
> > introduction as-is.
> 
> Ah, I see. Actually, I think that might work... if I create a
> 'upstream-backup' branch before rebasing, and reset the 'upstream'
> branch to that branch after, I can keep the full history of upstream,
> and authenticate it separately. And thanks to Ricardo's suggestion
> [12] to compare by Change-ID, it should even be possible to find
> corresponding commits between my fork and upstream. Looks like this
> solution meets all four requirements that I stated earlier, and it
> wouldn't be TOO annoying or complicated to script either.

Yes, it's the workflow I alluded to earlier: pull, authenticate, rebase

on your branch.
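
To make the two rebase directions concrete, here is a minimal sketch in
Python. It assumes a heavily simplified model in which a commit ID is
just a hash of the parent ID and a label for the change (real Git also
hashes the tree, author, committer and message). `commit_id` and
`build` are hypothetical helpers, not part of Git or Guix.

```python
import hashlib


def commit_id(parent_id, change):
    """Simplified stand-in for a Git commit hash: it depends on the parent ID."""
    return hashlib.sha1(f"{parent_id}\n{change}".encode()).hexdigest()


def build(base_id, changes):
    """Apply `changes` in order on top of `base_id`; return the new commit IDs."""
    ids, parent = [], base_id
    for change in changes:
        parent = commit_id(parent, change)
        ids.append(parent)
    return ids


root = commit_id("", "common ancestor")
upstream = build(root, ["upstream-1", "upstream-2"])
fork = build(root, ["fork introduction", "fork patch"])

# Direction 1: rebase the fork onto the new upstream tip. Every fork
# commit, including the introduction, gets a new ID.
fork_rebased = build(upstream[-1], ["fork introduction", "fork patch"])
print(fork[0] != fork_rebased[0])

# Direction 2: replay upstream's changes on top of the fork. The fork's
# introduction keeps its ID, but the replayed commits no longer have the
# IDs they have upstream.
upstream_replayed = build(fork[-1], ["upstream-1", "upstream-2"])
print(upstream_replayed[0] != upstream[0])
```

Both comparisons print True. In the second direction the introduction
stays put, which is why the replayed upstream commits have to be
matched back to their originals by some other means.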

> One limitation I can think of is that you can't really verify whether
> a commit ostensibly rebased from master is actually the same commit
> (imagine 100 commits in a rebase, are you going to check the diff for
> each? Versus with a merge you only have to check that the merge
> commit looks sane). Then again, this is really only a problem if
> you're using someone ELSE's fork and need to ensure they've not gone
> mad or evil, which is not my personal use-case, and perhaps not that
> common.

Yeah, it really isn't too hard to think of 9000 commits bumping rust-

whatever. We're not yet calling that Tuesday, but only because we lag

behind the Rust ecosystem as a whole.
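
Checking a long rebase by hand does not scale, but part of it can be
scripted. The sketch below, in Python, pairs rebased commits with their
upstream originals by Change-Id and flags any pair whose diff differs.
It assumes each commit is available as a (hash, message, diff) tuple
and that the trailer has the form `Change-Id: <id>` on its own line.
`change_id` and `match_commits` are hypothetical names, and how the
tuples are extracted from the repository is left open.

```python
import re

# Assumed trailer format: "Change-Id: <id>" on a line of its own.
CHANGE_ID = re.compile(r"^Change-Id:\s*(\S+)\s*$", re.MULTILINE)


def change_id(message):
    """Return the Change-Id trailer of a commit message, or None."""
    match = CHANGE_ID.search(message)
    return match.group(1) if match else None


def match_commits(upstream, fork):
    """Pair fork commits with upstream commits that share a Change-Id.

    Both arguments are lists of (hash, message, diff) tuples. The result
    has one (fork_hash, upstream_hash_or_None, same_diff) entry per fork
    commit; entries without a counterpart need manual review.
    """
    by_id = {}
    for commit_hash, message, diff in upstream:
        cid = change_id(message)
        if cid is not None:
            by_id[cid] = (commit_hash, diff)

    report = []
    for commit_hash, message, diff in fork:
        cid = change_id(message)
        counterpart = by_id.get(cid) if cid is not None else None
        if counterpart is None:
            report.append((commit_hash, None, False))
        else:
            upstream_hash, upstream_diff = counterpart
            report.append((commit_hash, upstream_hash, diff == upstream_diff))
    return report
```

A matching Change-Id with a differing diff is the case worth looking
at: either the rebase needed conflict resolution, or the commit is not
what it claims to be. Nothing here replaces authentication; it only
narrows down which commits a human has to read.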

> > > […]
> > > This led me to think of an attack that's possible with my design
> > > - if I want to screw with anyone `guix pull`ing from my fork, I
> > > can merge upstream into my fork branch, add a bunch of malicious
> > > commits, and then make the upstream branch ref point to the
> > > latest such commit.  Now anyone pulling from my fork will receive
> > > the malicious commits as part of upstream's history - since no
> > > commit hashes needed to change, the pull is a regular
> > > fast-forward one, with no indication that anything is amiss.
> > > Authentication will succeed since the malicious merge commit has
> > > our fork as its (first) parent, and that parent has the primary
> > > introduction as its most recent ancestor.
> > > 
> > > The takeaway here is that anyone authorized via the primary
> > > introduction can fake new upstream commits.
> > Care to state how designating one introduction as "primary" adds to
> > security here?
> 
> The problem I'm trying to solve is that you can't merge upstream into
> your fork unless you sign the merge commit with a key authorized in
> upstream, because of the intersections-of-parent-authorizations
> design.
> Tomas's solution was to do away with the 'intersections' requirement,
> and allow a key that's authorized for any parent to sign the merge
> commit; this is vulnerable to attacks like [4]. My solution falls
> somewhere in between - we keep the 'intersections' requirement,
> /except/ when one of the parent commits is a descendant of an
> introduction designated as the 'primary introduction'.
> 
> If you drop the 'primary introduction' designation (make all of them
> 'primary'), then you're basically dropping the intersections
> requirement. IOW, we've returned to the situation in Tomas's
> solution, where [4] is possible - anyone with even a single signed
> commit in Guix can now create a merge commit between Guix and your
> fork, even if their key was later removed from Guix.

Yeah, the point I'm trying to make here is that you still end up with
Tomas' solution as you can simply designate whatever primary you want.
The primary is not special in that sense.

> > > So what does all of this mean for the statement of my
> > > design? Well, it means that we need to stop thinking in terms of
> > > 'which branch can be merged into which?' and more in terms of
> > > 'which merge commits can be authenticated?'. And the answer to
> > > that question, with my design, would be:
> > > 
> > > 1. Any merge commit signed with a key in the intersection of its
> > >    parents' .guix-authorizations. (Standard authorization
> > >    invariant.)
> > > 
> > > 2. Any merge commit that doesn't meet the above conditions, but
> > >    has a parent whose most recent ancestor is the primary
> > >    introduction, and is signed by a key in the
> > >    .guix-authorizations of that parent. (My weakened
> > >    authorization invariant.)
> > That's a pretty long way of saying "Any merge commit signed with a
> > key in one of its parents' .guix-authorizations."
> 
> Not quite. Merge commits between upstream Guix and your fork, whose
> signing key is in the .guix-authorizations of its parent in upstream
> but not in the .guix-authorizations of its parent in your fork, would
> not be authenticated. (This is the kind of merge commit that would be
> used for [4].)

So merge commits for upstream will look like

---X---Y---Z-\
              (M)
---A---B---C-/

whereas merge commits for your fork will look like

---X---Y---Z-\
              (M)
---A---B---C-/

Uhm… how does `guix git authenticate' differentiate between the two?
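
For reference, the three rules discussed so far can be written down as
predicates. This is only a sketch in Python of the rules as worded in
this thread, not of how `guix git authenticate` is implemented. The
`Parent` record, the key names and the introduction labels are all made
up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Parent:
    """One parent of a merge commit (hypothetical model, not a Guix type)."""
    authorized_keys: frozenset  # keys listed in that parent's authorizations file
    introduction: str           # most recent introduction this parent descends from


def intersection_rule(signer, parents):
    """Standard invariant: the signer is authorized by every parent."""
    return all(signer in p.authorized_keys for p in parents)


def any_parent_rule(signer, parents):
    """Relaxed rule: the signer is authorized by at least one parent."""
    return any(signer in p.authorized_keys for p in parents)


def primary_rule(signer, parents, primary):
    """Proposed rule: the standard invariant, or the signer is authorized
    by a parent that descends from the designated primary introduction."""
    if intersection_rule(signer, parents):
        return True
    return any(
        p.introduction == primary and signer in p.authorized_keys
        for p in parents
    )


upstream = Parent(frozenset({"upstream-committer"}), "upstream-intro")
fork = Parent(frozenset({"fork-owner"}), "fork-intro")
parents = [upstream, fork]

print(intersection_rule("fork-owner", parents))
print(any_parent_rule("upstream-committer", parents))
print(primary_rule("fork-owner", parents, "fork-intro"))
print(primary_rule("upstream-committer", parents, "fork-intro"))
```

In this model the third rule's verdict depends entirely on which
introduction is passed as `primary`: with "upstream-intro" designated
instead, the upstream committer's merge is accepted.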

> > > What does "assert that this set is a subset of what we declare as
> > > introductions" mean?
> > Let's say that you work on branches B, C, and D with "primary"
> > introductions I, J, and K.  If you want to merge C into B, you
> > need to remember that B has I as its primary introduction, C has J,
> > and so on.
> 
> Ah, got it. Yes, that's correct. (Except that there's only one
> primary introduction, at least as I intended it.)

When those branches tail different upstreams, they won't necessarily
have the same primary introduction – at least as you intended it.

> > > my design does not require us to distribute any introductions
> > > besides Guix's existing one, nor will it provide any mechanism to
> > > automatically 'install' someone else's introduction.
> > Yes it will, per `%default-guix-channel'.
> 
> Ok, technically true. And someone with the ability to make trusted
> commits upstream could modify that so that a fresh `guix pull` skips
> whatever malicious stuff they've done. But my solution doesn't make
> this situation any worse AFAICT.

As far as you can tell.  I think the scheme would be gnarly enough to
implement that an exploit à la [4] would be possible, even if by
accident rather than design.

This is actually a stronger argument against soft forks of Guix whether
or not your idea is implemented, but anyone using a "trusted" fork
could have their %default-guix-channel changed by someone who is not an
upstream committer.

And no, malicious trusted commits cannot be undone once done; unless
you allow downgrades, which come with their own caveats.  Thus, for a
fresh checkout, a committer who has had access to upstream and your
fork at some point in time – not necessarily the same time – can make
it so that they are the only committer whose patches get accepted on
both.  (A weaker attack is still possible if you only have access to
one repo.)

> By that logic, we would never have gotten the ability to specify
> multiple channels, since that isn't used in upstream Guix and doesn't
> need to be.

Except that there are legitimate free software channels like Guix
Science, Guix Past or rde's Guix Home channel, which, if memory serves,
use Guix authorizations without issue.  "I want to fork Guix and I need
to worsen its security to do so" is not a good selling point, just
sayin.

Cheers

Tomas Volf wrote 11 hours ago

Recipients:(name . Saturanya Rahjane de Lasca)(address . saturanya@dismail.de)

Message-ID:87msfpqngp.fsf@wolfsden.cz

Saturanya Rahjane de Lasca <saturanya@dismail.de> writes:

>> Again, not disputing that things work fine for people with commit
>> access. Perhaps that is part of why this issue hasn't been addressed
>> before :P
> You may call us privileged – and yes, we are – but that doesn't change
> the fact that weakening security weakens security.

Just to be explicit here, this whole thread was motivated by a desire to
simplify soft-forks while keeping the same level of security. No one
here is arguing for weakening it.

Tomas

There are only two hard things in Computer Science:

cache invalidation, naming things and off-by-one errors.

Tomas Volf wrote 11 hours ago

Recipients:(name . Liliana Marie Prikler)(address . liliana.prikler@gmail.com)

Message-ID:87ikqdqnay.fsf@wolfsden.cz

Liliana Marie Prikler <liliana.prikler@gmail.com> writes:

>> All of these things discussed in this thread are technically
>> possible.  But I think that we all agree that the friction involved,
>> compared to just using my own fork with the patch applied, is much
>> larger, at least in my opinion.
> Yes, we can agree that this is your opinion.  We can even agree that
> there is more friction, but I'm not sure whether we agree on the value
> of "much".  But honestly speaking, the friction with contributing to
> upstream is much more a social one (too few people to review) than a
> technical one, and soft forks are a band aid that will likely burn you
> out even sooner the more you commit to them.

I do not know what the future holds, but the introduction of my soft-fork
is from Fri Sep 15 16:18:25 2023 +0200, and I am of the opinion that it
has saved me more work than it has cost me over time. But yes, as you
stated, this is just my opinion.

Tomas

There are only two hard things in Computer Science:

cache invalidation, naming things and off-by-one errors.

Nicolas Graves wrote 8 hours ago

Recipients:

Message-ID:87ed112jzp.fsf@ngraves.fr

On 2025-01-16 15:34, Liliana Marie Prikler wrote:

>> The complexity is due to the requirements of not bumping the channel
>> introduction (to avoid the increased attack surface from having to
>> keep obtaining the updated one, as I discussed earlier), keeping fork
>> history intact (to avoid force pulls), keeping upstream history
>> intact, and being able to authenticate both upstream and fork
>> commits. If you can think of a simpler method that meets these
>> requirements, I'd love to hear it.
> Guix committers are more than happy to use work trees and rebases,
> which simplify this a lot – again, it's as simple as pull,
> authenticate, rebase.
>
> W.r.t. keeping history intact, we had the following exchange on IRC
> yesterday.
>
> <Rutherther>	lfam: that's interesting that there is really a merge
> commit, for example if I remember right, the core updates merge few
> months ago happened by directly appending the commits instead of a
> merge commit
> <lfam>	Yes, there are two ways to do it (rebase and merge) and it's a
> matter of taste
> <lfam>	Of course there is a correct choice, as with all questions of
> taste ;)
> <Rutherther>	I personally prefer a merge commit, since it has two
> parents, you can track where the previous master pointed to
> <lfam>	And I prefer a rebase. But ultimately it's up to whoever is in
> front of the keyboard
> <lilyp>	FWIW, a rebase is cleaner, but requires that only one person
> signs off commits on any given branch (or else you're signing commits
> that someone else signed before and have to update the trailer… not
> ideal)
> <lfam>	It doesn't matter who signs the commits as long as they are
> authorized. That's the security model we have
>
> So yeah, even for (branch-)local work at least some committers prefer
> rebasing.
>

A loosely related comment:

I recently started a guix extension to help manage the complexities of
maintaining guix soft-forks. I haven't advertised it yet, and I'm fine
authenticating locally for now. I mainly focus on reproducibility of
patches sent (recording where patches are sent to be able to generate a
list of <origin> patches from a repo) and pulling from patched channels.

There is still some work ahead before I can even advertise it properly,
but feel free to take a look! There's no doc yet.

https://git.sr.ht/~ngraves/guix-stack

Best regards,

Nicolas Graves

To comment on this conversation send an email to 75552@debbugs.gnu.org
