Seed & xPUB: Important nuances in wallet security

ForkLog

3 years ago

The essence of the difference between seed phrases and extended keys (xPUB) can be distilled to this: a seed is a superkey with read-and-write rights, while an xPUB is read-only. But in practice this does not answer many questions related to real wallet protection. In this space, Web3 entrepreneur Vladimir Menaskop shares with ForkLog readers his rich experience and observations in the field.

Introduction

On one hand, a great deal has been written about hierarchical key generation. No less has been written about the various types of addresses and the methods of generating them, and about where all this has come from.

However, my personal experience writing materials for the Menaskop & Synergis project is born not from abstract theory but from empirical questions raised within the DAO. This time I encountered the following request: “Is it technically possible, knowing two public addresses (for example, in MetaMask — two different ETH accounts), to determine that they were generated from a single seed phrase? (Not knowing the seed, simply determine whether they belong to the same seed).”

Short answer: no, in general this is not possible. But there are nuances. They will be explained below.

I should also note that there is already an article about seed phrases that systematically discusses general theses, examples, and several subtle points, so I will not repeat it. ForkLog also has materials on the topic. But it is time to write a number of additions, for there is more that interests us.

What to read before diving in?

Here are three Russian-language articles that should be studied in order, from top to bottom, before or at least after reading this material:

«Hierarchical key generation». This article covers the main abstractions necessary to understand the formation of seed phrases, xPUB, and wallets from them;
«All about the seed phrase». The dive here is not as deep, but everything is presented with a view to practical use;
«On retrieving a private key from a seed phrase». The author “does and understands.”

Now let us move to questions and answers.

Riddles in the dark

The aim of this article is to gather many points that all tie back to one thing: the seed phrase. Not every point needs to connect to every other, but linking each of them to the central one can reveal something important that often escapes attention or is discussed only in narrowly focused Q&As. So, the questions.

Foundational

Some theses from the discussion mentioned above:

“I don’t understand this: if I create a new wallet in MetaMask, it will have the same seed phrase. But if I enter the seed phrase into another app, there will be only the first wallet. How can I use a second wallet in other apps?” The quote is taken from this thread. MetaMask is discussed below.
Then followed this thesis: “Second and additional wallets in MetaMask are not generated from the seed phrase and are not tied to it in any way. Therefore, if you enter it into another MetaMask, you gain access only to the wallet generated from this seed phrase.” This internet description is not entirely accurate. What exactly is the issue? Read below.
The question became more pressing because Trezor was hacked again, this time by an attacker with physical access to the device. And Ledger CEO Pascal Gauthier even took a position that contradicts the essence of the crypto market, stating that a seed phrase may belong to (not) only the user. In other words, the discussion began not from idle thought, but from the need of body and soul. Or money. After that came another high-profile hack: Atomic Wallet.
Finally, as you will see, this question has arisen in both Russian- and English-speaking communities more than once. And it has repeatedly given rise to misconceptions. It is time to dispel these myths.

What other difficulties arise?

If you want to push a discussion to the extreme or even the absurd, there are two paths: Reddit, if you want to discuss everything at once, and Stack Exchange, if you want to touch on more technical aspects. And here is what I managed to find in the sea of stories directly related to the seed phrase:

Since mnemonic phrases began to be widely used in 2013, it’s logical to assume that approaches have changed many times since then. Sometimes it’s not easy to reconstruct old tooling, but this is necessary. Example: “The words come from a mnemonic word list from around 2013, certainly not from BIP39. There are a total of 16 words.” There are dozens and even hundreds of such stories;
The next common question: “How can I get the private key from the seed phrase and account number generated by the coin?” The answer is given in article #3 from the “What to read before diving in?” section;
Or this interesting question: “Is there a multisignature scheme that don’t need xpub backups?” The answer is given at the link to the question, but what interests us is its framing. We will return to it in the section on wallet types.

As you can see, these are questions from different years, from different people and communities, and therefore one can speak of their universality, which is what we will base our discussion on.

Concepts and their content

Let us start with the basics:

Derivation, or more precisely the derivation path: a piece of data that tells a Hierarchical Deterministic (HD) wallet how to derive a particular key in the key tree.
Deterministic wallet: a wallet in which all private keys are derived from a single shared secret. Fundamentally this brings us to the seed. But first, another term: BIP. What is it?
BIP (Bitcoin Improvement Proposal): a proposal to improve Bitcoin’s code, formatted according to official rules. Any user can propose a BIP, but for it to be added to the code of the first cryptocurrency and activated, it must receive approval from developers and miners. The following BIPs are important for us:
- BIP32 (see examples below);
- BIP39: link №01, link №02, link №03 and link №04.
Another important term is HMAC (expanded as hash-based message authentication code or keyed-hash message authentication code): a mechanism for ensuring data integrity, guaranteeing that information transmitted or stored in an unreliable environment has not been altered by third parties.
Serialization: the process of translating an object into a byte sequence from which it can be fully restored.
And finally, the extended public key: a special key that effectively represents a group of public keys and, therefore, addresses. Anyone with access to it can see all addresses generated from it.
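To make the notion of a derivation path more tangible, here is a minimal Python sketch that turns a path string into the list of child indexes an HD wallet would walk through. The function name parse_path and the sample paths are illustrative assumptions; the only rule taken from BIP32 is that a hardened index (marked with ' or h) has 2^31 added to it.

```python
HARDENED_OFFSET = 0x80000000  # BIP32: hardened indexes start at 2**31


def parse_path(path: str) -> list[int]:
    """Turn a path such as "m/44'/60'/0'/0/1" into a list of child indexes."""
    parts = path.split("/")
    if parts[0] != "m":
        raise ValueError("path must start at the master key 'm'")
    indexes = []
    for part in parts[1:]:
        hardened = part.endswith(("'", "h"))
        number = int(part.rstrip("'h"))
        if not 0 <= number < HARDENED_OFFSET:
            raise ValueError(f"index out of range: {part}")
        indexes.append(number + HARDENED_OFFSET if hardened else number)
    return indexes


# Two neighbouring accounts on a path commonly used for Ethereum (BIP44, coin type 60)
first = parse_path("m/44'/60'/0'/0/0")
second = parse_path("m/44'/60'/0'/0/1")
print(first[:-1] == second[:-1], first[-1], second[-1])  # True 0 1
```

The two paths share every step except the last one, which is the only thing that separates “account 1” from “account 2” in such a wallet.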

Let us visualise the basic scheme. It may look as follows:

Master seed -> master key
Private parent key -> private child key
Private parent key -> public child key
Public parent key -> public child key
Public parent key -X-> private child key (not possible)
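The first two arrows of the scheme can be sketched with the Python standard library alone. This is a simplified illustration under stated assumptions, not a wallet implementation: it follows the BIP39 seed stretching and the BIP32 rules for the master key and for hardened private children, skips word-list and checksum validation, and leaves out everything that needs elliptic-curve arithmetic (public keys and non-hardened children). The function names are my own.

```python
import hashlib
import hmac
import unicodedata

# Order of the secp256k1 group used by Bitcoin keys
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
HARDENED_OFFSET = 0x80000000


def mnemonic_to_seed(mnemonic: str, passphrase: str = "") -> bytes:
    """BIP39: stretch the phrase into a 64-byte seed (no word-list check here)."""
    words = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha512", words, salt, 2048, dklen=64)


def master_key(seed: bytes) -> tuple[int, bytes]:
    """BIP32: master seed -> (master private key, master chain code)."""
    digest = hmac.new(b"Bitcoin seed", seed, hashlib.sha512).digest()
    key = int.from_bytes(digest[:32], "big")
    if key == 0 or key >= N:
        raise ValueError("invalid master key, use another seed")
    return key, digest[32:]


def hardened_child(parent_key: int, parent_chain: bytes, index: int) -> tuple[int, bytes]:
    """BIP32 CKDpriv, hardened case: private parent key -> private child key."""
    if index < HARDENED_OFFSET:
        index += HARDENED_OFFSET
    data = b"\x00" + parent_key.to_bytes(32, "big") + index.to_bytes(4, "big")
    digest = hmac.new(parent_chain, data, hashlib.sha512).digest()
    tweak = int.from_bytes(digest[:32], "big")
    child = (tweak + parent_key) % N
    if tweak >= N or child == 0:
        raise ValueError("invalid child, try the next index")
    return child, digest[32:]


# A widely published test phrase; never use it for real funds
phrase = "abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon about"
seed = mnemonic_to_seed(phrase)
k, c = master_key(seed)
k0, c0 = hardened_child(k, c, 0)
k1, c1 = hardened_child(k, c, 1)
print(len(seed), k0 != k1)  # 64 True
```

Note that the passphrase is part of the salt, so the same words with a different passphrase produce an unrelated seed and therefore an unrelated key tree.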

And now the question arises: what is the connection between seed, xPUB, and private key?

xPUB, yPUB, zPUB and so on

Here is what can be learned about them from this research:

xPUB. This is the name of an extended public key. xPUB is used on older wallets whose addresses start with 1. xPUB is created under the Bitcoin standard BIP32 and provides read-only access to the wallet. xPUB allows viewing all transactions, addresses, and balances of a specific wallet, but it does not allow spending the balance. To spend, a private key is required.
yPUB. The same as xPUB, but the letter “y” indicates that the extended public key belongs to a wallet following the Bitcoin standard BIP-49, which describes a SegWit addressing scheme that remains backward compatible with older software. The yPUB key has the address type P2SH-P2WPKH.
zPUB. Works on the same principle as yPUB, but the addressing scheme is not backward compatible. It corresponds to the P2WPKH address type, so zPUB is intended for wallets that are natively compatible with SegWit.

There is now a whole menagerie of different extended public keys: xPUB, yPUB, zPUB, tPUB, uPUB, vPUB. They are all extended public keys, just like their “older brothers” Ypub, Zpub, Upub and Vpub.
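What actually separates these prefixes is the 4-byte version field at the start of the serialized key. The Python sketch below, using only the standard library, decodes a Base58Check string, reads that field, and re-labels a key. The version values are those of BIP32 (xpub, tpub) and SLIP-0132 (the rest); the helper names and the placeholder payload are my assumptions, and the demo string is not a real key.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

# Version bytes: xpub/tpub from BIP32, the others as registered in SLIP-0132
VERSIONS = {
    "xpub": "0488b21e", "ypub": "049d7cb2", "zpub": "04b24746",
    "tpub": "043587cf", "upub": "044a5262", "vpub": "045f1cf6",
}


def b58encode(raw: bytes) -> str:
    n = int.from_bytes(raw, "big")
    out = ""
    while n > 0:
        n, rem = divmod(n, 58)
        out = ALPHABET[rem] + out
    pad = len(raw) - len(raw.lstrip(b"\0"))  # leading zero bytes become '1'
    return "1" * pad + out


def b58decode(text: str) -> bytes:
    n = 0
    for ch in text:
        n = n * 58 + ALPHABET.index(ch)
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    pad = len(text) - len(text.lstrip("1"))
    return b"\0" * pad + body


def checksum(payload: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]


def unpack(xkey: str) -> bytes:
    """Return the 78-byte payload of a serialized extended key."""
    raw = b58decode(xkey)
    payload, check = raw[:-4], raw[-4:]
    if len(payload) != 78 or checksum(payload) != check:
        raise ValueError("not a valid serialized extended key")
    return payload


def key_type(xkey: str) -> str:
    version = unpack(xkey)[:4].hex()
    for label, value in VERSIONS.items():
        if value == version:
            return label
    return "unknown"


def convert(xkey: str, target: str) -> str:
    """Swap only the version bytes; the key material stays the same."""
    payload = bytes.fromhex(VERSIONS[target]) + unpack(xkey)[4:]
    return b58encode(payload + checksum(payload))


# Placeholder payload (all-zero fields), for demonstration only
demo_payload = bytes.fromhex(VERSIONS["xpub"]) + bytes(74)
demo = b58encode(demo_payload + checksum(demo_payload))
print(key_type(demo), key_type(convert(demo, "zpub")))  # xpub zpub
```

In other words, the prefix is a hint to wallet software about which address type to build, not a different kind of key.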

But we are not interested in the nPUB approach itself, but in its relation to the seed. Yes, that is the next step.

Seed vs. xPUB

The beauty of xPUB is clear: it allows a third-party service to generate addresses on the user’s behalf. The service knows these addresses, but the private keys remain with the user.

But are there drawbacks to this approach? And do they exist at all when “from one, many are generated”? Let us consider them through attacks.

Brute-force training

The first thing one must do after collecting information about the system is a direct, or more precisely blunt, brute-force search. For this you can look at these tools:

But a natural question arises immediately: “How long could a direct, blunt brute-force search take?” Here are two examples:

First — from ForkLog: “A Bitcoin enthusiast picked a seed phrase from known words in half an hour”;
Second — from Habr: this piece outlines more precise brute-force parameters in a given case.

The conclusion is simple: if you know all of the words, or most of them (say, out of 12), and the dictionary and language are known, the time to crack is not merely short but catastrophically small. If neither the words nor their order are known, even eternity will not suffice for guessing or brute-forcing.
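The gap between the two cases is easy to check with plain arithmetic. In the Python sketch below the word-list size (2048) and the 128 bits of entropy behind a 12-word phrase come from BIP39; the guessing speed is a purely hypothetical assumption chosen for illustration.

```python
import math

WORDLIST_SIZE = 2048                 # BIP39 word lists contain 2048 words
GUESSES_PER_SECOND = 10**6           # hypothetical attacker speed, an assumption
SECONDS_PER_YEAR = 365 * 24 * 3600


def years_to_exhaust(candidates: int, rate: int = GUESSES_PER_SECOND) -> float:
    """Time needed to try every candidate at the assumed rate, in years."""
    return candidates / rate / SECONDS_PER_YEAR


# Case 1: all 12 words are known, only their order is lost
shuffled = math.factorial(12)

# Case 2: nothing is known: 12 positions, 2048 options each
unknown = WORDLIST_SIZE ** 12

# Only phrases with a correct checksum are valid, which leaves 2**128 of them
valid = 2 ** 128

print(f"known words, unknown order: {shuffled:,} candidates")
print(f"unknown words: 2**{unknown.bit_length() - 1} candidates")
print(f"seconds for case 1: {shuffled / GUESSES_PER_SECOND:,.0f}")
print(f"years for case 2: {years_to_exhaust(valid):.2e}")
```

Whatever rate is plugged in, the first case is a matter of minutes or hours and the second stays far beyond any practical horizon.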

There are also cases where wallet formation from the outset did not follow the rules: recall Profanity.

Public hype around such cases is inevitable, but mathematical laws cannot be overturned. A long-form answer to the quantum computer hack question can be read here. The point is that hackers do not live by brute force alone.

Attack on address distribution

Note that such attacks were originally conducted on vanity addresses, but subsequently spread to other formats.

This attack type came to mind as soon as the first question was posed for this article.

Ledger’s guidance on this matter: “Beware scammers who send a small amount of coins or NFTs to poison your transaction history in Ledger Live. They can masquerade as received transactions, NFTs or even plausible network fees. Always double-check the transaction details on the Ledger device to avoid sending funds to a fraudulent address by mistake. Be vigilant, as scammers’ addresses can be very similar to yours. Never copy and use addresses from your transaction history. A ‘poisoned’ account can be used in normal operation. Poisoning an account is not a hack.”

If you’re interested in the problem, I recommend reading this discussion. In any case, remember one thing: any free dust that lands in your wallet is inherently toxic. But do not jump to conclusions: a poisoned address is not a panacea, but one of the tools used for forensics in decentralised systems.
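The advice to compare addresses in full rather than by their edges can be illustrated with a small Python check. It is a sketch under assumptions: it treats addresses as Ethereum-style hex strings, the function names are hypothetical, and the two addresses are made up for the demonstration.

```python
def normalize(address: str) -> str:
    """Lower-case a hex address and drop the 0x prefix (Ethereum-style assumption)."""
    address = address.lower()
    return address[2:] if address.startswith("0x") else address


def is_lookalike(candidate: str, trusted: str, edge: int = 4) -> bool:
    """True if the addresses differ but share their first and last `edge` characters."""
    a, b = normalize(candidate), normalize(trusted)
    if a == b:
        return False
    return a[:edge] == b[:edge] and a[-edge:] == b[-edge:]


# Made-up 40-character addresses for illustration only
trusted = "0x" + "a1b2" + "0" * 32 + "c3d4"
poisoned = "0x" + "a1b2" + "f" * 32 + "c3d4"

print(is_lookalike(poisoned, trusted))  # True: same edges, different middle
print(is_lookalike(trusted, trusted))   # False: identical address
```

An interface that shows only the first and last characters would display these two addresses identically, which is exactly what the attack relies on.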

Other attacks

They stem from hardware and software architecture, from user errors, and from poorly concealed services. Since ForkLog has already published analyses of such attacks, I will point to one of them. And finally, the most important question remains.

Matching wallets to a seed phrase: possible or not?

First, let us define: why is this needed? And, second, who needs it.

The answers are as follows:

Hackers, in order to break into unlucky users’ wallets.
Ordinary users, in order to understand how a particular dapp works.

Let us begin with the second case, because thanks to MetaMask it has become quite widespread.

MM

In capable hands, MetaMask is a powerful and flexible tool. For those who want to explore its capabilities more deeply, I recommend studying this and this material.

But questions about MetaMask remain. Take the official documentation: “MetaMask will attempt to add your additional accounts where possible (assuming they were not imported), checking your previous accounts in ascending order (i.e. account 2, then account 3, etc.). Accounts are automatically re-added if they have a non-zero ETH balance. However this process ends when MetaMask encounters an account with 0 ETH—so the first account with 0 ETH (and all subsequent) will not be added.”
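The scan described in the quote can be modelled in a few lines of Python. This is a simplified model of the documented behaviour, not MetaMask’s actual code: restore_accounts, balance_of and limit are hypothetical names, the balances are invented, and the rule that the first account always comes back is an assumption of the sketch.

```python
from typing import Callable


def restore_accounts(balance_of: Callable[[int], int], limit: int = 100) -> list[int]:
    """Return the indexes of the accounts that reappear after a restore."""
    restored = [0]                   # assumption: the first account always comes back
    for index in range(1, limit):
        if balance_of(index) == 0:   # the first empty account stops the scan
            break
        restored.append(index)
    return restored


# Hypothetical balances: the second account is empty, the third holds funds
balances = {0: 5, 1: 0, 2: 7}
print(restore_accounts(lambda i: balances.get(i, 0)))  # [0]
```

The third account is still derived from the same seed phrase; it simply is not re-added automatically, because the scan stopped at the empty second account.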

Because of this approach, there have been difficulties. Here is a live example: “MetaMask suddenly disappeared from the browser for unknown reasons after a restore. The problem is that two wallets were linked to it; the second was added there via ‘Create account’. On restore, the first appeared, but the second did not. So it seems to have just vanished.”

Or this: “I could not access my MetaMask account and used restore from seed phrase and password. A wallet opened, but it was a different account. I did not save the private key for that account. I contacted MetaMask support but have not yet received an answer. On the internet I have not seen such a problem. I do not have hidden accounts in the wallet because I had one wallet and one account. The seed phrase was definitely correct because when I entered it and the password during restore, the wallet account opened, but with a completely different account. I discovered this yesterday and have not used the new account.”

Therefore many people attempt to map addresses by any means. There are two problems here:

xPUB, in the sense described above, is the prerogative of UTXO-based systems;
mapping wallets to one another by anything resembling hacking is possible only while you are using them, not when you are trying to recover them.

And here a vicious circle closes: what works against you does not work for you. Paradoxical? Yes, but it is a fact.

Therefore I strongly recommend keeping different kinds of wallets in separate setups, so that hardware wallets, imported wallets, and wallets created from seed phrases do not turn into clutter in MetaMask during recovery. In daily work this split does not get in the way (at least for me):

Hardware wallets for primary cold storage;
Imported ones for tests and linking services;
Primary (from seed phrases) for tests and simple operations.

But the pairing with Trezor & Ledger and/or their equivalents deserves a separate explanation.

Seed and hardware

What is on the market?

First of all, if you want to study the hardware-wallet market, consult the table compiled by regular ForkLog podcast guest Alex Petrov. In the course of evolution, specific devices may change, but the evaluation criteria remain important in themselves:

Current data can be found on BitcoinTalk. But a further important question now arises.

What is known about hacks?

Whether we like it or not, the zero-security principle works without fail. It states: “Any system can be hacked”. The question is the price: if the cost of breaking the system (in broader terms — energy) exceeds the profit, the system is usually left alone. Unless we are talking about destructive attacks.

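Therefore it does not matter whether you want to mine gold from tin (which is possible) or hack the Bitcoin network; you should evaluate everything through a simple formula: P(hack) > P(system), where P(hack) is the price of the attack and P(system) is the profit it can bring. While the inequality holds, the system is usually left alone.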

Hence, when people talk about hacking seed phrases or extracting them from hardware solutions, they forget to mention how much energy or money will be spent. We should not make that mistake.

With this in mind, let us evaluate two recent cases and their conceptual differences:

Physical breach of the Trezor T: this hack also affects older models and has long been known; the remedy is appropriate protection via a passphrase.
Compromise of the seed phrase in Ledger: in this story, I will say immediately, everything is well, especially the CTO’s explanation. Here the question of relevance is more ideological than architectural.

The difference between the potential compromise of a seed phrase in Ledger and the actual extraction in Trezor is that the latter is treated as an inevitable bug (or feature?) of the system as designed, akin to how we accept the 51% attack in PoW. The former, however, effectively violates the social consensus, which assumes that no one but the user can directly influence their assets (not their value, but their ownership and disposal).

That is why I have previously remained on the side of Trezor, although, due to the nature of my work, I use different solutions, and now the above point has pushed me toward even greater diversification of assets.

What else should hardware-wallet owners know?

There are no perfect systems. Here is an outline of the aspects of operation of TROPIC01 and related elements from the same Alex Petrov:

Today Satoshi Labs has advanced discussion of hardware security, but “this is essentially research work, supported by [several market participants], but it is far from a final product.”
Creating a truly secure chip is a big job, requiring solid practical experience in how to build such complex things and how to protect them.
The existence of things like RISC-V is not a panacea.
The present problems of Trezor lie in STM32F, which is cheap but not secure.

Hence I repeat: there are no perfect solutions. But you can still do something.

Security rules

The seed phrase requires both primary and secondary security measures. The difference between them is simple: primary measures have stood the test of time and form the minimum program, while secondary measures continually expand due to the evolution of Web3 services and markets.

Primary measures include:

Storing the seed on two to three alternative sources: usually a paper copy or a metal copy, decentralized storage among trusted people, and, for example, an encrypted USB drive in a safe;
The application of digital hygiene rules, even if they seem excessive: see the two-part article (first, second) published on ForkLog Community HUB;
The principle that if something is written from experience and blood, it is not worth rechecking. Seed phrases should never be disclosed to anyone, and the public key should not be either.

As for the Seed & xPUB relationship, I hope it has become clearer.