【Posted】: 2016-05-24 11:55:17
【Question】:
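I am having trouble running apps in Marathon with persistent local volumes. I followed the instructions, started Marathon with a role and a principal, and created a simple app with a persistent volume, but the app just hangs in a suspended state. The slave appears to respond with a valid offer, yet the app is never actually started. The slave logs nothing about the task, even when I compile with debug options and turn logging right up with GLOG_v=2.
In addition, Marathon seems to keep rolling the task ID because the task fails to start, but I cannot find the reason anywhere.
Oddly, when I run the app without a persistent volume, it starts fine.
The debug logs on Marathon do not seem to show anything useful, although I may be missing something. Can anyone give me pointers on what the problem might be, or where to look for additional debug output? Many thanks in advance.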
Here is some information about my environment, along with the debug output:
Slave: Ubuntu 14.04 running the prebuilt Mesos 0.28; also tested with 0.29 built from source
Master: Mesos 0.28 running in a Docker Ubuntu 14.04 image on CoreOS
Marathon: 1.1.1 running in a Docker Ubuntu 14.04 image on CoreOS
App with persistent storage
App info from v2/apps/test/tasks on Marathon
{
  "app": {
    "id": "/test",
    "cmd": "while true; do sleep 10; done",
    "args": null,
    "user": null,
    "env": {},
    "instances": 1,
    "cpus": 1,
    "mem": 128,
    "disk": 0,
    "executor": "",
    "constraints": [
      [
        "role",
        "CLUSTER",
        "persistent"
      ]
    ],
    "uris": [],
    "fetch": [],
    "storeUrls": [],
    "ports": [
      10002
    ],
    "portDefinitions": [
      {
        "port": 10002,
        "protocol": "tcp",
        "labels": {}
      }
    ],
    "requirePorts": false,
    "backoffSeconds": 1,
    "backoffFactor": 1.15,
    "maxLaunchDelaySeconds": 3600,
    "container": {
      "type": "MESOS",
      "volumes": [
        {
          "containerPath": "test",
          "mode": "RW",
          "persistent": {
            "size": 100
          }
        }
      ]
    },
    "healthChecks": [],
    "readinessChecks": [],
    "dependencies": [],
    "upgradeStrategy": {
      "minimumHealthCapacity": 0.5,
      "maximumOverCapacity": 0
    },
    "labels": {},
    "acceptedResourceRoles": null,
    "ipAddress": null,
    "version": "2016-05-19T11:31:54.861Z",
    "residency": {
      "relaunchEscalationTimeoutSeconds": 3600,
      "taskLostBehavior": "WAIT_FOREVER"
    },
    "versionInfo": {
      "lastScalingAt": "2016-05-19T11:31:54.861Z",
      "lastConfigChangeAt": "2016-05-18T16:46:59.684Z"
    },
    "tasksStaged": 0,
    "tasksRunning": 0,
    "tasksHealthy": 0,
    "tasksUnhealthy": 0,
    "deployments": [
      {
        "id": "4f3779e5-a805-4b95-9065-f3cf9c90c8fe"
      }
    ],
    "tasks": [
      {
        "id": "test.4b7d4303-1dc2-11e6-a179-a2bd870b1e9c",
        "slaveId": "9f7c6ed5-4bf5-475d-9311-05d21628604e-S17",
        "host": "ip-10-0-90-61.eu-west-1.compute.internal",
        "localVolumes": [
          {
            "containerPath": "test",
            "persistenceId": "test#test#4b7d4302-1dc2-11e6-a179-a2bd870b1e9c"
          }
        ],
        "appId": "/test"
      }
    ]
  }
}
App info in Marathon: (the deployment appears to keep spinning)
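In the response above, the task entry carries `localVolumes` but neither `stagedAt` nor `startedAt`. A minimal Python sketch for flagging such tasks might look like the following. The function name `unlaunched_tasks` is hypothetical, and reading a missing `stagedAt`/`startedAt` as "volume assigned but task never launched" is an assumption made for illustration, not documented Marathon behaviour. The sample is a trimmed copy of the response, keeping only the task fields that matter here.

```python
def unlaunched_tasks(app_response):
    """Return IDs of tasks that list local volumes but were never staged or started.

    Assumption: a task without `stagedAt` and `startedAt` has not been launched.
    """
    tasks = app_response.get("app", {}).get("tasks", [])
    stuck = []
    for task in tasks:
        has_volume = bool(task.get("localVolumes"))
        launched = "stagedAt" in task or "startedAt" in task
        if has_volume and not launched:
            stuck.append(task["id"])
    return stuck


# Trimmed copy of the /test response shown above (only the task entry is kept).
sample = {
    "app": {
        "id": "/test",
        "tasks": [
            {
                "id": "test.4b7d4303-1dc2-11e6-a179-a2bd870b1e9c",
                "slaveId": "9f7c6ed5-4bf5-475d-9311-05d21628604e-S17",
                "localVolumes": [
                    {
                        "containerPath": "test",
                        "persistenceId": "test#test#4b7d4302-1dc2-11e6-a179-a2bd870b1e9c",
                    }
                ],
                "appId": "/test",
            }
        ],
    }
}

print(unlaunched_tasks(sample))
```

For the data above this reports the single `/test` task, which fits the symptom that the volume exists on paper while the task itself never reaches the staging state.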
App without persistent storage
App info from v2/apps/test2/tasks on Marathon
{
  "app": {
    "id": "/test2",
    "cmd": "while true; do sleep 10; done",
    "args": null,
    "user": null,
    "env": {},
    "instances": 1,
    "cpus": 1,
    "mem": 128,
    "disk": 100,
    "executor": "",
    "constraints": [
      [
        "role",
        "CLUSTER",
        "persistent"
      ]
    ],
    "uris": [],
    "fetch": [],
    "storeUrls": [],
    "ports": [
      10002
    ],
    "portDefinitions": [
      {
        "port": 10002,
        "protocol": "tcp",
        "labels": {}
      }
    ],
    "requirePorts": false,
    "backoffSeconds": 1,
    "backoffFactor": 1.15,
    "maxLaunchDelaySeconds": 3600,
    "container": null,
    "healthChecks": [],
    "readinessChecks": [],
    "dependencies": [],
    "upgradeStrategy": {
      "minimumHealthCapacity": 0.5,
      "maximumOverCapacity": 0
    },
    "labels": {},
    "acceptedResourceRoles": null,
    "ipAddress": null,
    "version": "2016-05-19T13:44:01.831Z",
    "residency": null,
    "versionInfo": {
      "lastScalingAt": "2016-05-19T13:44:01.831Z",
      "lastConfigChangeAt": "2016-05-19T13:09:20.106Z"
    },
    "tasksStaged": 0,
    "tasksRunning": 1,
    "tasksHealthy": 0,
    "tasksUnhealthy": 0,
    "deployments": [],
    "tasks": [
      {
        "id": "test2.bee624f1-1dc7-11e6-b98e-568f3f9dead8",
        "slaveId": "9f7c6ed5-4bf5-475d-9311-05d21628604e-S18",
        "host": "ip-10-0-90-61.eu-west-1.compute.internal",
        "startedAt": "2016-05-19T13:44:02.190Z",
        "stagedAt": "2016-05-19T13:44:02.023Z",
        "ports": [
          31926
        ],
        "version": "2016-05-19T13:44:01.831Z",
        "ipAddresses": [
          {
            "ipAddress": "10.0.90.61",
            "protocol": "IPv4"
          }
        ],
        "appId": "/test2"
      }
    ],
    "lastTaskFailure": {
      "appId": "/test2",
      "host": "ip-10-0-90-61.eu-west-1.compute.internal",
      "message": "Slave ip-10-0-90-61.eu-west-1.compute.internal removed: health check timed out",
      "state": "TASK_LOST",
      "taskId": "test2.e74fb439-1dc2-11e6-a179-a2bd870b1e9c",
      "timestamp": "2016-05-19T13:15:24.155Z",
      "version": "2016-05-19T13:09:20.106Z",
      "slaveId": "9f7c6ed5-4bf5-475d-9311-05d21628604e-S17"
    }
  }
}
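To narrow down which settings differ between the app that hangs and the app that runs, the two definitions can be compared key by key. The Python sketch below is only an illustration: `config_diff` and its `ignore` list are hypothetical helpers, and the two dictionaries are trimmed copies of the `/test` and `/test2` definitions shown in this post, not full responses.

```python
# Keys that describe runtime state or identity rather than configuration.
RUNTIME_KEYS = (
    "id", "version", "versionInfo", "tasks", "deployments", "lastTaskFailure",
    "tasksStaged", "tasksRunning", "tasksHealthy", "tasksUnhealthy",
)


def config_diff(app_a, app_b, ignore=RUNTIME_KEYS):
    """Return {key: (value_in_a, value_in_b)} for every config key that differs."""
    keys = sorted((set(app_a) | set(app_b)) - set(ignore))
    return {
        key: (app_a.get(key), app_b.get(key))
        for key in keys
        if app_a.get(key) != app_b.get(key)
    }


# Trimmed copies of the two app definitions from this post.
app_with_volume = {
    "id": "/test",
    "cpus": 1,
    "mem": 128,
    "disk": 0,
    "container": {
        "type": "MESOS",
        "volumes": [
            {"containerPath": "test", "mode": "RW", "persistent": {"size": 100}}
        ],
    },
    "residency": {
        "relaunchEscalationTimeoutSeconds": 3600,
        "taskLostBehavior": "WAIT_FOREVER",
    },
}
app_without_volume = {
    "id": "/test2",
    "cpus": 1,
    "mem": 128,
    "disk": 100,
    "container": None,
    "residency": None,
}

for key, (left, right) in config_diff(app_with_volume, app_without_volume).items():
    print(f"{key}: {left!r} vs {right!r}")
```

On these trimmed definitions only `container`, `disk`, and `residency` differ, so the persistent-volume settings are the only configuration change between the two apps.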
Slave log while the app is running:
I0519 13:09:22.471876 12459 status_update_manager.cpp:320] Received status update TASK_RUNNING (UUID: 36c1f0cb-2fcd-44b9-ab79-cef81c2094be) for task test2.e74fb439-1dc2-11e6-a179-a2bd870b1e9c of framework 1a6352a6-d690-41a2-967e-07342bba56d2-0000
I0519 13:09:22.471906 12459 status_update_manager.cpp:497] Creating StatusUpdate stream for task test2.e74fb439-1dc2-11e6-a179-a2bd870b1e9c of framework 1a6352a6-d690-41a2-967e-07342bba56d2-0000
I0519 13:09:22.472262 12459 status_update_manager.cpp:824] Checkpointing UPDATE for status update TASK_RUNNING (UUID: 36c1f0cb-2fcd-44b9-ab79-cef81c2094be) for task test2.e74fb439-1dc2-11e6-a179-a2bd870b1e9c of framework 1a6352a6-d690-41a2-967e-07342bba56d2-0000
I0519 13:09:22.477686 12459 status_update_manager.cpp:374] Forwarding update TASK_RUNNING (UUID: 36c1f0cb-2fcd-44b9-ab79-cef81c2094be) for task test2.e74fb439-1dc2-11e6-a179-a2bd870b1e9c of framework 1a6352a6-d690-41a2-967e-07342bba56d2-0000 to the agent
I0519 13:09:22.477830 12453 process.cpp:2605] Resuming slave(1)@10.0.90.61:5051 at 2016-05-19 13:09:22.477814016+00:00
I0519 13:09:22.477967 12453 slave.cpp:3638] Forwarding the update TASK_RUNNING (UUID: 36c1f0cb-2fcd-44b9-ab79-cef81c2094be) for task test2.e74fb439-1dc2-11e6-a179-a2bd870b1e9c of framework 1a6352a6-d690-41a2-967e-07342bba56d2-0000 to master@10.0.82.230:5050
I0519 13:09:22.478185 12453 slave.cpp:3532] Status update manager successfully handled status update TASK_RUNNING (UUID: 36c1f0cb-2fcd-44b9-ab79-cef81c2094be) for task test2.e74fb439-1dc2-11e6-a179-a2bd870b1e9c of framework 1a6352a6-d690-41a2-967e-07342bba56d2-0000
I0519 13:09:22.478229 12453 slave.cpp:3548] Sending acknowledgement for status update TASK_RUNNING (UUID: 36c1f0cb-2fcd-44b9-ab79-cef81c2094be) for task test2.e74fb439-1dc2-11e6-a179-a2bd870b1e9c of framework 1a6352a6-d690-41a2-967e-07342bba56d2-0000 to executor(1)@10.0.90.61:34262
I0519 13:09:22.488315 12460 pid.cpp:95] Attempting to parse 'master@10.0.82.230:5050' into a PID
I0519 13:09:22.488370 12460 process.cpp:646] Parsed message name 'mesos.internal.StatusUpdateAcknowledgementMessage' for slave(1)@10.0.90.61:5051 from master@10.0.82.230:5050
I0519 13:09:22.488452 12452 process.cpp:2605] Resuming slave(1)@10.0.90.61:5051 at 2016-05-19 13:09:22.488441856+00:00
I0519 13:09:22.488600 12458 process.cpp:2605] Resuming (14)@10.0.90.61:5051 at 2016-05-19 13:09:22.488590080+00:00
I0519 13:09:22.488632 12458 status_update_manager.cpp:392] Received status update acknowledgement (UUID: 36c1f0cb-2fcd-44b9-ab79-cef81c2094be) for task test2.e74fb439-1dc2-11e6-a179-a2bd870b1e9c of framework 1a6352a6-d690-41a2-967e-07342bba56d2-0000
I0519 13:09:22.488726 12458 status_update_manager.cpp:824] Checkpointing ACK for status update TASK_RUNNING (UUID: 36c1f0cb-2fcd-44b9-ab79-cef81c2094be) for task test2.e74fb439-1dc2-11e6-a179-a2bd870b1e9c of framework 1a6352a6-d690-41a2-967e-07342bba56d2-0000
I0519 13:09:22.492985 12452 process.cpp:2605] Resuming slave(1)@10.0.90.61:5051 at 2016-05-19 13:09:22.492974080+00:00
I0519 13:09:22.493021 12452 slave.cpp:2629] Status update manager successfully handled status update acknowledgement (UUID: 36c1f0cb-2fcd-44b9-ab79-cef81c2094be) for task test2.e74fb439-1dc2-11e6-a179-a2bd870b1e9c of framework 1a6352a6-d690-41a2-967e-07342bba56d2-0000
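To confirm whether the slave log mentions a given task at all, the glog-style lines can be filtered by task ID prefix. The Python sketch below assumes the line layout seen in the excerpt above (severity letter, month and day, time, thread ID, `file:line]`, message); the regular expression and the helper `lines_mentioning` are illustrative, and the two sample lines are copied verbatim from that excerpt.

```python
import re

# Layout assumed from the excerpt: "I0519 13:09:22.471876 12459 file.cpp:320] message"
GLOG_LINE = re.compile(
    r"^(?P<level>[IWEF])(?P<month>\d{2})(?P<day>\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+(?P<thread>\d+) "
    r"(?P<source>[^:\s]+):(?P<line>\d+)\] (?P<message>.*)$"
)


def lines_mentioning(log_text, task_prefix):
    """Return (time, source file, message) for log lines whose message contains task_prefix."""
    hits = []
    for raw in log_text.splitlines():
        match = GLOG_LINE.match(raw.strip())
        if match and task_prefix in match.group("message"):
            hits.append(
                (match.group("time"), match.group("source"), match.group("message"))
            )
    return hits


# Two lines copied from the slave log above.
sample_log = """
I0519 13:09:22.471876 12459 status_update_manager.cpp:320] Received status update TASK_RUNNING (UUID: 36c1f0cb-2fcd-44b9-ab79-cef81c2094be) for task test2.e74fb439-1dc2-11e6-a179-a2bd870b1e9c of framework 1a6352a6-d690-41a2-967e-07342bba56d2-0000
I0519 13:09:22.477830 12453 process.cpp:2605] Resuming slave(1)@10.0.90.61:5051 at 2016-05-19 13:09:22.477814016+00:00
"""

print(len(lines_mentioning(sample_log, "test2.")))  # lines about the app without a volume
print(len(lines_mentioning(sample_log, "test.")))   # lines about the app with a volume
```

Run against the full slave log, the same filter with the prefix `test.` would show whether the persistent-volume task ever reached the slave.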
【Comments】:
- Can you post the Marathon logs? Especially the part where the offer is accepted.