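【Posted】: 2017-07-28 23:40:09
【Question】:
I recently started learning Visual Basic and have been experimenting with parsing HTML data just for fun. When I ran into some JSON, I downloaded the Newtonsoft package (Json.NET) and started learning how it works. As a first step I am only trying to get the profile picture URL from any user's Instagram page, but I hit an error I can't seem to solve. Since I'm new to VB, I figured it was better to ask for help than to wear myself out.
The code is as follows: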
Imports HtmlAgilityPack
Imports Newtonsoft.Json

Module Module1
    Sub Main()
        Dim user As String = Console.ReadLine()
        Dim html = "https://www.instagram.com/" + user
        Console.WriteLine(html)
        Dim web As New HtmlWeb()
        Dim htmlDoc = web.Load(html)
        For Each node As HtmlNode In htmlDoc.DocumentNode.SelectNodes("//script[@type='text/javascript']")
            If node.InnerHtml.Contains("profile_pic_url_hd") Then 'Makes sure the correct javascript code is used.
                Dim json = node.InnerHtml.Substring(21, node.InnerHtml.Length - 21) 'Deletes the non Json code in the javascript.
                Dim m As User = JsonConvert.DeserializeObject(Of User)(json) 'Error is here
                Dim picture As String = m.profile_pic_url_hd
                Console.WriteLine(picture)
                Console.ReadLine()
            Else
                Console.WriteLine("Could not find correct code! Possibly because the username doesn't exist")
            End If
        Next
        Console.WriteLine()
    End Sub

    Public Class User
        Public Property biography As String
        Public Property blocked_by_viewer As Boolean
        Public Property country_block As Boolean
        Public Property external_url As Object
        Public Property external_url_linkshimmed As Object
        Public Property followed_by As Integer
        Public Property followed_by_viewer As Boolean
        Public Property follows As Integer
        Public Property follows_viewer As Boolean
        Public Property full_name As String
        Public Property has_blocked_viewer As Boolean
        Public Property has_requested_viewer As Boolean
        Public Property id As String
        Public Property is_private As Boolean
        Public Property is_verified As Boolean
        Public Property profile_pic_url As String
        Public Property profile_pic_url_hd As String
        Public Property requested_by_viewer As Boolean
        Public Property username As String
        Public Property connected_fb_page As Object
        Public Property media As Object
    End Class
End Module
I get the error on this line:
Dim m As User = JsonConvert.DeserializeObject(Of User)(json)
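It says: Newtonsoft.Json.JsonReaderException: 'Additional text encountered after finished reading JSON content: ;. Path '', line 1, position 3220.' The position number changes every time, but I'm not sure why this happens.
Any help is appreciated!
Edit: The JSON is different for every Instagram account, but here is FIFA's JSON as an example: https://pastebin.com/J3U0uz4S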
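A note on the error text above: Json.NET raises "Additional text encountered after finished reading JSON content" when characters remain after the root JSON value has been fully read, and the message names the leftover character as ";". The Substring(21, ...) call drops the leading 21 characters but leaves whatever follows the closing brace. The following is only a minimal sketch of that situation, not a confirmed fix for the real page: the sample string is made up, and ExtractJsonObject is a hypothetical helper that keeps the text from the first "{" to the last "}".

```vbnet
Imports System
Imports Newtonsoft.Json
Imports Newtonsoft.Json.Linq

Module TrailingSemicolonSketch
    ' Hypothetical helper (not part of Json.NET): keeps only the text
    ' from the first "{" to the last "}" of the script contents.
    Public Function ExtractJsonObject(script As String) As String
        Dim first As Integer = script.IndexOf("{"c)
        Dim last As Integer = script.LastIndexOf("}"c)
        If first < 0 OrElse last < first Then
            Throw New ArgumentException("No JSON object found in the script text.")
        End If
        Return script.Substring(first, last - first + 1)
    End Function

    Sub Main()
        ' Made-up stand-in for the script contents; the real payload is far larger.
        Dim script As String = "window._sharedData = {""example_key"": ""example_value""};"

        ' Same slicing as in the question: the trailing ";" is still present,
        ' so Json.NET reports additional text after the JSON content.
        Dim sliced As String = script.Substring(21)
        Try
            JsonConvert.DeserializeObject(Of JObject)(sliced)
        Catch ex As JsonReaderException
            Console.WriteLine("Fails: " & ex.Message)
        End Try

        ' With the trailing text removed, the same content parses.
        Dim parsed As JObject = JObject.Parse(ExtractJsonObject(script))
        Console.WriteLine(parsed.Value(Of String)("example_key"))
    End Sub
End Module
```

Searching for the outer braces rather than hard-coding an offset of 21 is a design choice for the sketch; it assumes the script contains exactly one top-level object literal.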
【Comments】:
-
It would help to post the JSON -
As Object also looks suspicious. It usually means there is a type in the JSON that is not represented in the class.
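To illustrate the point about As Object: one option is to type such members as JToken and to locate the nested object before mapping it to a class. This is a sketch under assumptions, not the actual layout of the pasted JSON. The path entry_data.ProfilePage[0].user, the sample data, and the ReadUser helper are all hypothetical and would have to be checked against the real payload.

```vbnet
Imports System
Imports Newtonsoft.Json.Linq

Module NestedUserSketch
    ' Trimmed-down version of the User class from the question.
    Public Class User
        Public Property username As String
        Public Property profile_pic_url_hd As String
        ' JToken instead of Object keeps the nested structure queryable.
        Public Property media As JToken
    End Class

    ' Hypothetical helper: finds the user object at an ASSUMED path and
    ' maps it to User. Returns Nothing when the path does not exist.
    Public Function ReadUser(json As String) As User
        Dim root As JObject = JObject.Parse(json)
        Dim userToken As JToken = root.SelectToken("entry_data.ProfilePage[0].user")
        If userToken Is Nothing Then
            Return Nothing
        End If
        Return userToken.ToObject(Of User)()
    End Function

    Sub Main()
        ' Made-up sample that mimics a nested layout; not real Instagram data.
        Dim json As String = "{""entry_data"": {""ProfilePage"": [{""user"": " &
            "{""username"": ""example"", " &
            """profile_pic_url_hd"": ""https://example.com/pic.jpg"", " &
            """media"": {""count"": 3}}}]}}"

        Dim m As User = ReadUser(json)
        If m Is Nothing Then
            Console.WriteLine("User object not found at the assumed path.")
        Else
            Console.WriteLine(m.profile_pic_url_hd)
            Console.WriteLine(m.media.Value(Of Integer)("count"))
        End If
    End Sub
End Module
```

If the user object really is nested like this, deserializing the whole document directly into User would leave its properties empty, which is why the sketch selects the inner token first.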
Tags: json vb.net json.net instagram screen-scraping