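Problem description
A user ran into a problem while using Terraform and wants to know why the remote-exec provisioner of a null_resource cannot connect to an aws_instance over SSH.
Below is the .tf file the user ran: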
# Define the AWS provider
provider "aws" {
  # Credentials path: ~/.aws/credentials
  profile = var.aws_profile
}
# Define the key pair
resource "aws_key_pair" "key_pair" {
  # Variable default: "id_rsa"
  key_name = var.aws_key_pair_name
  # Variable default: the public key of "id_rsa"
  public_key = var.aws_key_pair_public
}
# Define the security group
resource "aws_security_group" "security_group" {
  ingress {
    from_port = 22
    to_port = 22
    protocol = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
  egress {
    from_port = 0
    to_port = 0
    protocol = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
  tags = {
    # Variable default: "security-group-1"
    Name = var.aws_security_group_tags_name
  }
}
# Define the instance
resource "aws_instance" "instance" {
  # Variable default: the ID of an Ubuntu AMI
  ami = var.aws_instance_ami
  # Variable default: "t2.micro"
  instance_type = var.aws_instance_type
  associate_public_ip_address = true
  key_name = aws_key_pair.key_pair.key_name
  vpc_security_group_ids = [aws_security_group.security_group.id]
  tags = {
    # Variable default: "ec2-instance-1"
    Name = var.aws_instance_tags_name
  }
}
# Define the null_resource
resource "null_resource" "instance" {
  provisioner "remote-exec" {
    connection {
      type = "ssh"
      host = aws_instance.instance.public_ip
      # Variable default: "ubuntu", the default system user of the Ubuntu AMI
      user = var.aws_instance_user_name
      # Variable default: "~/.ssh/id_rsa"
      # Path to the private key that matches the public key given to aws_key_pair.key_pair
      private_key = file(var.aws_key_pair_private_path)
      timeout = "20s"
    }
    inline = ["echo 'remote-exec message'"]
  }
  provisioner "local-exec" {
    command = "echo 'local-exec message'"
  }
}
The user tried setting the private key file's permissions to 400 and to 600; in both cases the following error was returned:
aws_instance.instance (remote-exec): Connecting to remote host via SSH...
aws_instance.instance (remote-exec): Host: 54.82.23.158
aws_instance.instance (remote-exec): User: ubuntu
aws_instance.instance (remote-exec): Password: false
aws_instance.instance (remote-exec): Private key: true
aws_instance.instance (remote-exec): Certificate: false
aws_instance.instance (remote-exec): SSH Agent: true
aws_instance.instance (remote-exec): Checking Host Key: false
aws_instance.instance (remote-exec): Target Platform: unix
aws_instance.instance: Still creating... [1m0s elapsed]
╷
│ Error: remote-exec provisioner error
│
│ with aws_instance.instance,
│ on main.tf line 63, in resource "aws_instance" "instance":
│ 63: provisioner "remote-exec" {
│
│ timeout - last error: SSH authentication failed (ubuntu@54.82.23.158:22): ssh: handshake failed: ssh: unable to
│ authenticate, attempted methods [none publickey], no supported methods remain
Yet the following command connects to the EC2 instance successfully:
ubuntu:~/projects/course-1/project-1$ ssh -i "id_rsa" ubuntu@ec2-54-163-199-195.compute-1.amazonaws.com
The user wants to know what they missed and whether there is a better way to solve this.
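Solutions
Note that behavior may differ between versions, and back up your configuration before making the changes below.
Solution 1
You need to change the remote-exec setup slightly: declare the connection block at the resource level, where it applies to every provisioner in the resource, and then use the provisioners to run the commands: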
resource "null_resource" "instance" {
  connection {
    type = "ssh"
    host = aws_instance.instance.public_ip
    # Variable default: "ubuntu", the default system user of the Ubuntu AMI
    user = var.aws_instance_user_name
    # Variable default: "~/.ssh/id_rsa"
    # Path to the private key that matches the public key given to aws_key_pair.key_pair
    private_key = file(var.aws_key_pair_private_path)
    timeout = "20s"
  }
  provisioner "remote-exec" {
    inline = ["echo 'remote-exec message'"]
  }
  provisioner "local-exec" {
    command = "echo 'local-exec message'"
  }
}
This should solve your problem. Thanks!
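If the same publickey error persists, one thing that may be worth ruling out is a mismatch between the private key Terraform reads and the public key uploaded through aws_key_pair. The following is only a sketch of such a check, not part of the original answer: same_public_key is a hypothetical helper name, and the paths in the usage comment are assumptions based on the "id_rsa" defaults mentioned in the question.

```shell
#!/bin/sh
# Compare two OpenSSH public keys while ignoring the trailing comment field.
# Returns 0 only when both the key type and the base64 body are identical.
# (same_public_key is a hypothetical helper, not a Terraform or OpenSSH command.)
same_public_key() {
    # Keep only the first two fields (key type and key body) of each key.
    a=$(printf '%s\n' "$1" | awk 'NF >= 2 {print $1, $2}')
    b=$(printf '%s\n' "$2" | awk 'NF >= 2 {print $1, $2}')
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Usage (paths are assumptions; adjust them to your setup):
#   derived=$(ssh-keygen -y -f ~/.ssh/id_rsa)   # public key derived from the private key
#   uploaded=$(cat ~/.ssh/id_rsa.pub)           # public key passed to aws_key_pair
#   same_public_key "$derived" "$uploaded" && echo "keys match" || echo "keys differ"
```

ssh-keygen -y prints the public key that belongs to a private key file, so a mismatch here would point to the wrong file path rather than to the connection block.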
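Solution 2
Using a script or a tool to manage container start order can add complexity, and you need to make sure the dependency between container A and container B is set up correctly.
Another approach is to write a script or use a tool to control the order in which the containers run. You can use the docker run command to control the start order manually, or use a third-party tool to manage container dependencies.
Example:
The following simple bash script starts container B after container A has started: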
#!/bin/bash
# Start container A
docker run -d --name container_a your_image_a
# Wait until container A accepts commands
while ! docker exec container_a echo "Container A is ready"; do
  sleep 1
done
# Start container B
docker run -d --name container_b your_image_b
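In this example, we first start container A with docker run and name it container_a. A loop then waits until container A is up, tested here by running echo inside the container. Note that this only confirms the container is running and accepts docker exec, not that the application inside it is ready. Once container A is ready, we start container B with docker run and name it container_b.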
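The wait loop above never gives up if container A fails to start. As a minimal sketch of one way to bound the wait, the hypothetical wait_for helper below retries a command once per second and stops after a given number of seconds. The helper name and the 30-second limit in the usage comment are assumptions for illustration.

```shell
#!/bin/sh
# wait_for TIMEOUT_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND about once per second until it succeeds or the timeout elapses.
# Returns 0 on success and 1 on timeout. (Hypothetical helper for illustration.)
wait_for() {
    timeout_s=$1
    shift
    elapsed=0
    until "$@" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout_s" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Usage with the container names from the script above:
#   docker run -d --name container_a your_image_a
#   if wait_for 30 docker exec container_a true; then
#       docker run -d --name container_b your_image_b
#   else
#       echo "container_a did not become ready within 30s" >&2
#       exit 1
#   fi
```

The readiness command is passed in as arguments, so it can be swapped for a stronger check (for example one that probes the application itself) without touching the loop.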