In theory, resources from the GCP provider can be referenced in the K8S (or any other) provider the same way you'd reference resources or data sources within the context of a single provider:
provider "google" {
region = "us-west1"
}
data "google_compute_zones" "available" {}
resource "google_container_cluster" "primary" {
name = "the-only-marcellus-wallace"
zone = "${data.google_compute_zones.available.names[0]}"
initial_node_count = 3
additional_zones = [
"${data.google_compute_zones.available.names[1]}"
]
master_auth {
username = "mr.yoda"
password = "adoy.rm"
}
node_config {
oauth_scopes = [
"https://www.googleapis.com/auth/compute",
"https://www.googleapis.com/auth/devstorage.read_only",
"https://www.googleapis.com/auth/logging.write",
"https://www.googleapis.com/auth/monitoring"
]
}
}
provider "kubernetes" {
host = "https://${google_container_cluster.primary.endpoint}"
username = "${google_container_cluster.primary.master_auth.0.username}"
password = "${google_container_cluster.primary.master_auth.0.password}"
client_certificate = "${base64decode(google_container_cluster.primary.master_auth.0.client_certificate)}"
client_key = "${base64decode(google_container_cluster.primary.master_auth.0.client_key)}"
cluster_ca_certificate = "${base64decode(google_container_cluster.primary.master_auth.0.cluster_ca_certificate)}"
}
resource "kubernetes_namespace" "n" {
metadata {
name = "blablah"
}
}
In practice, however, it may not work as expected, because a known core bug breaks cross-provider dependencies; see https://github.com/hashicorp/terraform/issues/12393 and https://github.com/hashicorp/terraform/issues/4149 respectively.
The alternative solutions are:
- use a 2-staged apply and target the GKE cluster first, then anything else that depends on it, i.e. terraform apply -target=google_container_cluster.primary and then terraform apply (see the sketch right after this list)
- separate out the GKE cluster configuration from the K8S configuration, give them completely isolated workflows and connect them via remote state.
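A minimal sketch of the 2-staged apply, assuming the configuration from the example above (the resource address is the one defined there):

# 1st stage: create/update only the cluster and anything it depends on
terraform apply -target=google_container_cluster.primary
# 2nd stage: create/update everything else, incl. the K8S resources
terraform apply

The second (remote state) approach then looks like this: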
/terraform-gke/main.tf
terraform {
  backend "gcs" {
    bucket = "tf-state-prod"
    prefix = "terraform/state"
  }
}

provider "google" {
  region = "us-west1"
}

data "google_compute_zones" "available" {}

resource "google_container_cluster" "primary" {
  name               = "the-only-marcellus-wallace"
  zone               = "${data.google_compute_zones.available.names[0]}"
  initial_node_count = 3

  additional_zones = [
    "${data.google_compute_zones.available.names[1]}",
  ]

  master_auth {
    username = "mr.yoda"
    password = "adoy.rm"
  }

  node_config {
    oauth_scopes = [
      "https://www.googleapis.com/auth/compute",
      "https://www.googleapis.com/auth/devstorage.read_only",
      "https://www.googleapis.com/auth/logging.write",
      "https://www.googleapis.com/auth/monitoring",
    ]
  }
}

output "gke_host" {
  value = "https://${google_container_cluster.primary.endpoint}"
}

output "gke_username" {
  value = "${google_container_cluster.primary.master_auth.0.username}"
}

output "gke_password" {
  value = "${google_container_cluster.primary.master_auth.0.password}"
}

output "gke_client_certificate" {
  value = "${base64decode(google_container_cluster.primary.master_auth.0.client_certificate)}"
}

output "gke_client_key" {
  value = "${base64decode(google_container_cluster.primary.master_auth.0.client_key)}"
}

output "gke_cluster_ca_certificate" {
  value = "${base64decode(google_container_cluster.primary.master_auth.0.cluster_ca_certificate)}"
}
Here we're exposing all the necessary configuration via outputs and using the backend to store the state, along with those outputs, in a remote location, GCS in this case. This allows us to reference them in the configuration below.
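Assuming the /terraform-gke layout from above, a quick way to verify that the outputs actually made it into the remote state is terraform output (just a sketch; run it from the cluster workspace):

cd terraform-gke
terraform init             # configures the "gcs" backend declared above
terraform output gke_host  # should print https://<cluster-endpoint>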
/terraform-k8s/main.tf
data "terraform_remote_state" "foo" {
backend = "gcs"
config {
bucket = "tf-state-prod"
prefix = "terraform/state"
}
}
provider "kubernetes" {
host = "https://${data.terraform_remote_state.foo.gke_host}"
username = "${data.terraform_remote_state.foo.gke_username}"
password = "${data.terraform_remote_state.foo.gke_password}"
client_certificate = "${base64decode(data.terraform_remote_state.foo.gke_client_certificate)}"
client_key = "${base64decode(data.terraform_remote_state.foo.gke_client_key)}"
cluster_ca_certificate = "${base64decode(data.terraform_remote_state.foo.gke_cluster_ca_certificate)}"
}
resource "kubernetes_namespace" "n" {
metadata {
name = "blablah"
}
}
What may not be entirely obvious here is that the cluster has to be created/updated before creating/updating any K8S resources (if such an update relies on updates of the cluster).
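In other words the two workflows always run in order; a sketch with the directory layout above:

# 1st: create/update the cluster and publish its outputs to the remote state
cd terraform-gke && terraform apply
# 2nd: consume those outputs and create/update the K8S resources
cd ../terraform-k8s && terraform init && terraform apply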
Taking the second approach is generally advisable (even when/if the bug weren't a factor and cross-provider references worked), because it reduces the blast radius and defines much clearer responsibilities. It is (IMO) common for such deployments to have one person/team responsible for managing the cluster and a different person/team managing the K8S resources.
There may certainly be overlaps though - e.g. ops wanting to deploy logging & monitoring infrastructure on top of a fresh GKE cluster - so cross-provider dependencies aim to satisfy such use cases. For that reason I'd recommend subscribing to the GH issues mentioned above.