Google Cloud Launcher: Single node file server deployment failing

【Question title】: Google Cloud Launcher: Single node file server deployment failing
【Posted】: 2017-04-28 21:40:21
【Question】:

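I'm trying to deploy a single node file server by following the instructions here: https://cloud.google.com/solutions/using-tensorflow-jupyterhub-classrooms

When I follow the instructions, the instance appears to come up fine, but NFS does not seem to be running. When I try to mount the share from another instance with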

sudo mount -t nfs jupyterhub-filer-vm:/jupyterhub /mnt

I get

mount.nfs: Connection timed out

When I inspect the filer instance in the Compute Engine UI (https://console.cloud.google.com/compute/instancesDetail/zones/us-east1-d/instances/jupyterhub-filer-vm), I see

Custom metadata

ADMIN_PASSWORD  xxx
ATTACHED_DISKS  jupyterhub-filer-vm-jupyterhub
C2D_STATUS  DEPLOYMENT_FAILED
ENABLE_NFS  enable:True
ENABLE_SMB  enable:False
FILE_SYSTEM xfs
STORAGE_POOL_NAME   jupyterhub

The documentation suggests running

gcloud compute ssh --ssh-flag=-L3000:localhost:3000 --project=workpop-dev --zone us-east1-d jupyterhub-filer-vm

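and then opening localhost:3000 in a browser to reach the performance dashboard. The ssh command connects me to the instance, but the browser returns ERR_EMPTY_RESPONSE, and in the ssh session I see channel 4: open failed: connect failed: Connection refused.

In the ssh session, I tried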

$ ps -e | grep nfs

which returns nothing.

$ cat /etc/exports

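returns a default file containing only comments.

So I looked for the disk with $ sudo find / -name "jupyterhub*", but that returned nothing. Looking around further, I found the following lines at the end of /opt/c2d/setup.log: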

VIRTUAL_IP =
+ readonly ZFS_KERNEL_CONFIG=/etc/modprobe.d/zfs.conf
+ ZFS_KERNEL_CONFIG=/etc/modprobe.d/zfs.conf
+ networks=(10.0.0.0/8 127.0.0.1)
+ readonly networks
+ readonly DISK_PREFIX=/dev/disk/by-id/google
+ DISK_PREFIX=/dev/disk/by-id/google
+ readonly DATA_DEVICE=/dev/disk/by-id/google-jupyterhub-filer-vm-data
+ DATA_DEVICE=/dev/disk/by-id/google-jupyterhub-filer-vm-data
+ [[ xfs = \z\f\s ]]
+ [[ -n '' ]]
+ case "${FILE_SYSTEM}" in
+ mkfs.xfs -L jupyterhub /dev/disk/by-id/google-jupyterhub-filer-vm-data
/dev/disk/by-id/google-jupyterhub-filer-vm-data: No such file or directory
Usage: mkfs.xfs
/* blocksize */         [-b log=n|size=num]
/* metadata */          [-m crc=0|1,finobt=0|1]
/* data subvol */       [-d agcount=n,agsize=n,file,name=xxx,size=num,
                            (sunit=value,swidth=value|su=num,sw=num|noalign),
                            sectlog=n|sectsize=num
/* force overwrite */   [-f]
/* inode size */        [-i log=n|perblock=n|size=num,maxpct=n,attr=0|1|2,
                            projid32bit=0|1]
/* no discard */        [-K]
/* log subvol */        [-l agnum=n,internal,size=num,logdev=xxx,version=n
                            sunit=value|su=num,sectlog=n|sectsize=num,
                            lazy-count=0|1]
/* label */             [-L label (maximum 12 characters)]
/* naming */            [-n log=n|size=num,version=2|ci,ftype=0|1]
/* no-op info only */   [-N]
/* prototype file */    [-p fname]
/* quiet */             [-q]
/* realtime subvol */   [-r extsize=num,size=num,rtdev=xxx]
/* sectorsize */        [-s log=n|size=num]
/* version */           [-V]
                        devicename
<devicename> is required unless -d name=xxx is given.
<num> is xxx (bytes), xxxs (sectors), xxxb (fs blocks), xxxk (xxx KiB),
      xxxm (xxx MiB), xxxg (xxx GiB), xxxt (xxx TiB) or xxxp (xxx PiB).
<value> is xxx (512 byte blocks).

At this point I'm sure something has gone wrong, but I don't know how to fix it. Can anyone help?

【Comments】:

In case it helps anyone, this is also being tracked at github.com/GoogleCloudPlatform/gke-jupyter-classroom/issues/1

Tags: google-cloud-platform nfs google-cloud-launcher

【Solution 1】:

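The problem is the disk name. Your setup.log shows the setup script looking for /dev/disk/by-id/google-jupyterhub-filer-vm-data, while the attached disk is named jupyterhub-filer-vm-jupyterhub.

Try it with the default value: Storage Name = data

(With that value the setup finished for me without errors, and localhost:3000 loads properly. I'm not sure whether it will cause errors later in the lab.)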
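
To make the name mismatch concrete, here is a rough sketch that compares the device path the setup script appears to expect with the path an attached disk would get. It assumes, based only on the DATA_DEVICE line in the setup.log above, that the script always looks for "<DISK_PREFIX>-<instance>-data", and that the by-id device name equals the disk name shown in ATTACHED_DISKS. The function names are made up for illustration and are not part of the Cloud Launcher scripts.

```shell
#!/bin/sh
# Assumption: same prefix as the DISK_PREFIX line in /opt/c2d/setup.log.
DISK_PREFIX=/dev/disk/by-id/google

# Path the setup script appears to expect for a given instance name
# (hypothetical helper, inferred from the DATA_DEVICE log line).
expected_data_device() {
  printf '%s-%s-data\n' "$DISK_PREFIX" "$1"
}

# Path an attached disk would get, assuming device name == disk name.
attached_device() {
  printf '%s-%s\n' "$DISK_PREFIX" "$1"
}

# Compare the two paths; returns non-zero on a mismatch.
# Usage: check_disk_name <instance-name> <attached-disk-name>
check_disk_name() {
  expected="$(expected_data_device "$1")"
  actual="$(attached_device "$2")"
  if [ "$expected" = "$actual" ]; then
    echo "match: $expected"
  else
    echo "mismatch: expected $expected, attached $actual"
    return 1
  fi
}

# Values from the question: storage name "jupyterhub" gives a mismatch.
check_disk_name jupyterhub-filer-vm jupyterhub-filer-vm-jupyterhub || true
# With the default storage name "data" the two paths line up.
check_disk_name jupyterhub-filer-vm jupyterhub-filer-vm-data
```

This only compares strings. On the instance itself, ls /dev/disk/by-id/ shows which device paths actually exist.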

【Comments】:

FYI, the lab documentation has been updated to say "Storage Name = data". Support for custom storage names may be added in a future update.