Commit graph

56 commits

Author SHA1 Message Date
Kristoffer Dalby
50b62ddfb3
fix loading policy manager
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-10-26 18:19:14 -04:00
Kristoffer Dalby
8d5b04f3d3
hook up user and node changes to policy
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-10-26 12:53:04 -05:00
Kristoffer Dalby
6afb554e20
wrap policy in policy manager interface
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-10-26 11:42:14 -05:00
Kristoffer Dalby
50165ce9e1
resolve user identifier to stable ID
currently, the policy approach node to user matching
with a quite naive approach looking at the username
provided in the policy and matched it with the username
on the nodes. This worked ok as long as usernames were
unique and did not change.

As usernames are no longer guarenteed to be unique in
an OIDC environment we cant rely on this.

This changes the mechanism that matches the user string
(now user token) with nodes:

- first find all potential users by looking up:
  - database ID
  - provider ID (OIDC)
  - username/email

If more than one user is matching, then the query is
rejected, and zero matching nodes are returned.

When a single user is found, the node is matched against
the User database ID, which are also present on the actual
node.

This means that from this commit, users can use the following
to identify users in the policy:
- provider identity (iss + sub)
- username
- email
- database id

There are more changes coming to this, so it is not recommended
to start using any of these new abilities, with the exception
of email, which will not change since it includes an @.

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-10-23 22:31:37 -05:00
Kristoffer Dalby
218138afee
Redo OIDC configuration (#2020)
expand user, add claims to user

This commit expands the user table with additional fields that
can be retrieved from OIDC providers (and other places) and
uses this data in various tailscale response objects if it is
available.

This is the beginning of implementing
https://docs.google.com/document/d/1X85PMxIaVWDF6T_UPji3OeeUqVBcGj_uHRM5CI-AwlY/edit
trying to make OIDC more coherant and maintainable in addition
to giving the user a better experience and integration with a
provider.

remove usernames in magic dns, normalisation of emails

this commit removes the option to have usernames as part of MagicDNS
domains and headscale will now align with Tailscale, where there is a
root domain, and the machine name.

In addition, the various normalisation functions for dns names has been
made lighter not caring about username and special character that wont
occur.

Email are no longer normalised as part of the policy processing.

untagle oidc and regcache, use typed cache

This commits stops reusing the registration cache for oidc
purposes and switches the cache to be types and not use any
allowing the removal of a bunch of casting.

try to make reauth/register branches clearer in oidc

Currently there was a function that did a bunch of stuff,
finding the machine key, trying to find the node, reauthing
the node, returning some status, and it was called validate
which was very confusing.

This commit tries to split this into what to do if the node
exists, if it needs to register etc.

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-10-02 14:50:17 +02:00
enoperm
1e61084898
Add compatibility with only websocket-capable clients (#2132)
* handle control protocol through websocket

The necessary behaviour is already in place,
but the wasm build only issued GETs, and the handler was not invoked.

* get DERP-over-websocket working for wasm clients

* Prepare for testing builtin websocket-over-DERP

Still needs some way to assert that clients are connected through websockets,
rather than the TCP hijacking version of DERP.

* integration tests: properly differentiate between DERP transports

* do not touch unrelated code

* linter fixes

* integration testing: unexport common implementation of derp server scenario

* fixup! integration testing: unexport common implementation of derp server scenario

* dockertestutil/logs: remove unhelpful comment

* update changelog

---------

Co-authored-by: Csaba Sarkadi <sarkadicsa@tutanota.de>
2024-09-21 12:05:36 +02:00
Kristoffer Dalby
60b94b0467
Fix slow shutdown (#2113)
* rearrange shutdown

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* http closed is fine

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* update changelog

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* logging while shutting

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

---------

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-09-09 14:10:22 +02:00
nblock
bac7ea67f4
Simplify windows setup instructions (#2114)
* Simplify /windows to the bare minimum. Also remove the
  /windows/tailscale.reg endpoint as its generated file is no longer
  valid for current Tailscale versions.
* Update and simplify the windows documentation accordingly.
* Add a "Unattended mode" section to the troubleshooting section
  explaining how to enable "Unattended mode" in the via the Tailscale
  tray icon.
* Add infobox about /windows to the docs

Tested on Windows 10, 22H2 with Tailscale 1.72.0

Replaces: #1995
See: #2096
2024-09-09 13:18:16 +02:00
Kristoffer Dalby
2b5e52b08b
validate policy against nodes, error if not valid (#2089)
* validate policy against nodes, error if not valid

this commit aims to improve the feedback of "runtime" policy
errors which would only manifest when the rules are compiled to
filter rules with nodes.

this change will in;

file-based mode load the nodes from the db and try to compile the rules on
start up and return an error if they would not work as intended.

database-based mode prevent a new ACL being written to the database if
it does not compile with the current set of node.

Fixes #2073
Fixes #2044

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* ensure stderr can be used in err checks

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* test policy set validation

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* add new integration test to ghaction

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* add back defer for cli tst

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

---------

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-08-30 16:58:29 +02:00
greizgh
8571513e3c
reformat code (#2019)
* reformat code

This is mostly an automated change with `make lint`.
I had to manually please golangci-lint in routes_test because of a short
variable name.

* fix start -> strategy which was wrongly corrected by linter
2024-07-22 08:56:00 +02:00
Kristoffer Dalby
ca47d6f353
small cleanups (#2017) 2024-07-19 09:21:14 +02:00
Kristoffer Dalby
7e62031444
replace ephemeral deletion logic (#2008)
* replace ephemeral deletion logic

this commit replaces the way we remove ephemeral nodes,
currently they are deleted in a loop and we look at last seen
time. This time is now only set when a node disconnects and
there was a bug (#2006) where nodes that had never disconnected
was deleted since they did not have a last seen.

The new logic will start an expiry timer when the node disconnects
and delete the node from the database when the timer is up.

If the node reconnects within the expiry, the timer is cancelled.

Fixes #2006

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* use uint64 as authekyid and ptr helper in tests

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* add test db helper

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* add list ephemeral node func

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* schedule ephemeral nodes for removal on startup

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* fix gorm query for postgres

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* add godoc

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

---------

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-07-18 10:01:59 +02:00
Pallab Pain
58bd38a609
feat: implements apis for managing headscale policy (#1792) 2024-07-18 07:38:25 +02:00
Kristoffer Dalby
c8ebbede54
Simplify map session management (#1931)
This PR removes the complicated session management introduced in https://github.com/juanfont/headscale/pull/1791 which kept track of the sessions in a map, in addition to the channel already kept track of in the notifier.

Instead of trying to close the mapsession, it will now be replaced by the new one and closed after so all new updates goes to the right place.

The map session serve function is also split into a streaming and a non-streaming version for better readability.

RemoveNode in the notifier will not remove a node if the channel is not matching the one that has been passed (e.g. it has been replaced with a new one).

A new tuning parameter has been added to added to set timeout before the notifier gives up to send an update to a node.

Add a keep alive resetter so we wait with sending keep alives if a node has just received an update.

In addition it adds a bunch of env debug flags that can be set:

- `HEADSCALE_DEBUG_HIGH_CARDINALITY_METRICS`: make certain metrics include per node.id, not recommended to use in prod. 
- `HEADSCALE_DEBUG_PROFILING_ENABLED`: activate tracing 
- `HEADSCALE_DEBUG_PROFILING_PATH`: where to store traces 
- `HEADSCALE_DEBUG_DUMP_CONFIG`: calls `spew.Dump` on the config object startup
- `HEADSCALE_DEBUG_DEADLOCK`: enable go-deadlock to dump goroutines if it looks like a deadlock has occured, enabled in integration tests.

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-05-24 10:15:34 +02:00
Stefan Majer
8185a70dc7
Fix typos (#1860)
* Fix typos

* trigger GitHub actions

* remove kdiff3 orig files

* fix unicode

* remove unnecessary function call

* remove unnecessary comment

* remove unnecessary comment

---------

Co-authored-by: ohdearaugustin <ohdearaugustin@users.noreply.github.com>
2024-05-19 23:49:27 +02:00
Kristoffer Dalby
622aa82da2
ensure expire routines are cleaned up (#1924)
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-05-02 15:57:53 +00:00
Kristoffer Dalby
a9c568c801
trace log and notifier shutdown (#1922)
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-05-02 13:39:19 +02:00
Kristoffer Dalby
cb0b495ea9
batch updates in notifier (#1905) 2024-04-27 10:47:39 +02:00
Juan Font
bd047928f7
Move pprof to metrics router (#1902) 2024-04-21 22:08:59 +02:00
Kristoffer Dalby
ba614a5e6c
metrics, tuning in tests, db cleanups, fix concurrency issue (#1895) 2024-04-21 18:28:17 +02:00
Kristoffer Dalby
2ce23df45a
Migrate IP fields in database to dedicated columns (#1869) 2024-04-17 07:03:06 +02:00
Kristoffer Dalby
60f0cf908c more log.Error -> fmt.Errorf cleanup 2024-04-15 12:31:53 +02:00
Kristoffer Dalby
1704977e76 improve testing of route failover logic
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-04-15 12:31:53 +02:00
Kristoffer Dalby
58c94d2bd3 Rework map session
This commit restructures the map session in to a struct
holding the state of what is needed during its lifetime.

For streaming sessions, the event loop is structured a
bit differently not hammering the clients with updates
but rather batching them over a short, configurable time
which should significantly improve cpu usage, and potentially
flakyness.

The use of Patch updates has been dialed back a little as
it does not look like its a 100% ready for prime time. Nodes
are now updated with full changes, except for a few things
like online status.

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-04-15 12:31:53 +02:00
Kristoffer Dalby
384ca03208
new IP allocator and add postgres to integration tests. (#1756) 2024-02-18 19:31:29 +01:00
Kristoffer Dalby
b60ee9db54
improve errors for missing directories (#1765)
* improve errors for missing directories

Fixes #1761
Updates #1760

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* update container docs

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* update changelog with /var changes

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

---------

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-02-17 13:36:19 +01:00
Kristoffer Dalby
c73e8476b9
make database configuration change breaking (#1766)
A lot of things are breaking in 0.23 so instead of having this
be a long process, just rip of the plaster.

Updates #1758

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-02-17 13:18:15 +01:00
Kristoffer Dalby
94b30abf56
Restructure database config (#1700) 2024-02-09 07:27:00 +01:00
Kristoffer Dalby
83769ba715
Replace database locks with transactions (#1701)
This commits removes the locks used to guard data integrity for the
database and replaces them with Transactions, turns out that SQL had
a way to deal with this all along.

This reduces the complexity we had with multiple locks that might stack
or recurse (database, nofitifer, mapper). All notifications and state
updates are now triggered _after_ a database change.


Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-02-08 17:28:19 +01:00
Alexander Halbarth
7e8bf4bfe5
Add Customization Options to DERP Map entry of integrated DERP server (#1565)
Co-authored-by: Alexander Halbarth <alexander.halbarth@alite.at>
Co-authored-by: Bela Lemle <bela.lemle@alite.at>
Co-authored-by: Kristoffer Dalby <kristoffer@dalby.cc>
2024-01-16 16:04:03 +01:00
Kristoffer Dalby
55ca078f22
embed (hidden) tailsql for debugging (#1663)
Signed-off-by: Kristoffer Dalby <kristoffer@dalby.cc>
2023-12-20 21:47:48 +01:00
Kristoffer Dalby
cf8ffea154
turn off grpc communication logging (#1640) 2023-12-10 15:22:59 +01:00
Kristoffer Dalby
f65f4eca35
ensure online status and route changes are propagated (#1564) 2023-12-09 18:09:24 +01:00
Kristoffer Dalby
a59aab2081
Remove support for non-noise clients (pre-1.32) (#1611) 2023-11-23 08:31:33 +01:00
Kristoffer Dalby
ed4e19996b
Use tailscale key types instead of strings (#1609)
* upgrade tailscale

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* make Node object use actualy tailscale key types

This commit changes the Node struct to have both a field for strings
to store the keys in the database and a dedicated Key for each type
of key.

The keys are populated and stored with Gorm hooks to ensure the data
is stored in the db.

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* use key types throughout the code

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* make sure machinekey is concistently used

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* use machine key in auth url

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* fix web register

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* use key type in notifier

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* fix relogin with webauth

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

---------

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2023-11-19 22:37:04 +01:00
Kristoffer Dalby
c0fd06e3f5
remove the use key stripping and store the proper keys (#1603) 2023-11-16 17:55:29 +01:00
Juan Font
0030af3fa4
Rename Machine to Node (#1553) 2023-09-24 06:42:05 -05:00
Kristoffer Dalby
c957f893bd Return simple responses immediatly
This commit rearranges the poll handler to immediatly accept
updates and notify its peers and return, not travel down the
function for a bit. This reduces the DB calls and other
holdups that isnt necessary to send a "lite response", a
map response without peers, or accepting an endpoint update.

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2023-09-19 10:20:21 -05:00
Kristoffer Dalby
591ff8d347 add pprof endpoint
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2023-09-19 10:20:21 -05:00
Kristoffer Dalby
432e975a7f move MapResponse peer logic into function and reuse
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2023-09-19 10:20:21 -05:00
Kristoffer Dalby
53a9e28faf Add missing return in shutdown
Co-Authored-By: Jason <armooo@armooo.net>
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2023-09-19 10:20:21 -05:00
Kristoffer Dalby
4b65cf48d0 Split up MapResponse
This commits extends the mapper with functions for creating "delta"
MapResponses for different purposes (peer changed, peer removed, derp).

This wires up the new state management with a new StateUpdate struct
letting the poll worker know what kind of update to send to the
connected nodes.

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2023-09-19 10:20:21 -05:00
Kristoffer Dalby
66ff1fcd40 Replace the timestamp based state system
This commit replaces the timestamp based state system with a new
one that has update channels directly to the connected nodes. It
will send an update to all listening clients via the polling
mechanism.

It introduces a new package notifier, which has a concurrency safe
manager for all our channels to the connected nodes.

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2023-09-19 10:20:21 -05:00
Kristoffer Dalby
717abe89c1 remove "stripEmailDomain" argument
This commit makes a wrapper function round the normalisation requiring
"stripEmailDomain" which has to be passed in almost all functions of
headscale by loading it from Viper instead.

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2023-06-21 10:31:48 +02:00
Kristoffer Dalby
c1218ad3c2 move reminder of dns funcs to util
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2023-06-08 16:34:15 +02:00
Kristoffer Dalby
d36336a572 fix lint
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2023-06-08 16:34:15 +02:00
Kristoffer Dalby
80ea87c032 move derp_server to derp server module
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2023-06-08 16:34:15 +02:00
Kristoffer Dalby
8c4c4c8633 move derp.go to derp module
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2023-06-08 16:34:15 +02:00
Kristoffer Dalby
2289a2acbf move Config definitions into types
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2023-06-08 16:34:15 +02:00
Kristoffer Dalby
725bbd7408 Remove variables and leftovers of pregenerated ACL content
Prior to the code reorg, we would generate rules from the Policy and
store it on the global object. Now we generate it on the fly for each node
and this commit cleans up the old variables to make sure we have no
unexpected side effects.

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2023-06-08 16:34:15 +02:00