【Posted】: 2017-05-21 02:33:42
【Question】:
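I have just started experimenting with image processing, and I am running into some very strange problems, or at least I think they are strange. I assume I am making some very silly mistake.
I was going to post a separate question about this, but with the code below I also sometimes get random noise instead of a pixel representation of the digit the user drew. I would appreciate it if someone could tell me why that happens. I am having a hard time finding the cause, because everything I have read suggests this code should work.
If anyone needs more information, please let me know. Thanks in advance for your help!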
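Goal:
First, get the digit the user draws on the screen. Then, resize the image to 28 x 28. Next, convert the image to grayscale and get an array of normalized pixel values. Finally, feed the normalized grayscale pixel values into a machine learning algorithm.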
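The normalization step described above can be sketched in isolation. This is only an illustration: `normalizedPixels` is a hypothetical helper that does not appear in the code further down, and it assumes the grayscale data is already available as one 8-bit sample per pixel.

```swift
// Hypothetical helper (not part of the original code): maps 8-bit gray
// samples (0...255) to normalized values in 0...1.
func normalizedPixels(_ bytes: [UInt8]) -> [Float] {
    return bytes.map { Float($0) / 255.0 }
}

// Three sample pixels: black, dark gray, white.
let sample: [UInt8] = [0, 51, 255]
let normalized = normalizedPixels(sample)
print(normalized)
```

Working on a plain `[UInt8]` keeps the normalization independent of how the bitmap was produced, so it can be checked without any drawing code.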
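[Note: in the images below, dots represent 0 values and 1s represent values > 0.]
The output of the code below looks fine. If the user draws a "3", I usually get something like this: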
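Problem:
If I change the element type of the UnsafeMutablePointer and Buffer to UInt8, I get what looks like random noise. And if I replace the UnsafeMutablePointer and Buffer with [UInt32](repeating: 0, count: totalBytes) or even [UInt8](repeating: 0, count: totalBytes), every pixel ends up being 0, which I really do not understand.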
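As background on what the element type changes, here is a small sketch of how the same bytes look through a UInt8 view and a UInt32 view. It is an illustration of element width only, not a confirmed diagnosis of the noise; the sample bytes and the names `asUInt8` and `asUInt32` are made up for the example. An 8-bit grayscale context with `bytesPerRow: width` writes one byte per pixel, while indexing a UInt32 pointer advances four bytes per element.

```swift
// Eight gray samples, one byte each, laid out the way an 8-bit
// grayscale bitmap with bytesPerRow == width would store them.
let bytes: [UInt8] = [0, 0, 255, 255, 0, 0, 255, 255]

// Viewed as UInt8 there is one element per pixel.
let asUInt8 = bytes.map { Int($0) }

// Viewed as UInt32, each element is built from four neighbouring pixels,
// so there are only a quarter as many elements.
let asUInt32: [UInt32] = bytes.withUnsafeBytes { raw in
    stride(from: 0, to: raw.count, by: 4).map {
        raw.load(fromByteOffset: $0, as: UInt32.self)
    }
}

print(asUInt8.count, asUInt32.count)
print(MemoryLayout<UInt8>.stride, MemoryLayout<UInt32>.stride)
```

If this applies here, `pixels[index]` on a UInt32 buffer would not be the gray value of pixel `index`, but a combination of four pixels starting at byte `4 * index`.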
If I change the element type of the UnsafeMutablePointer and Buffer to UInt8, this is the pixel output:
Code to get the grayscale pixels:
public extension UIImage
{
    private func grayScalePixels() -> UnsafeMutableBufferPointer<UInt32>?
    {
        guard let cgImage = self.cgImage else { return nil }

        let bitsPerComponent = 8
        let width = cgImage.width
        let height = cgImage.height
        let totalBytes = (width * height)
        let colorSpace = CGColorSpaceCreateDeviceGray()

        let data = UnsafeMutablePointer<UInt32>.allocate(capacity: totalBytes)
        defer { data.deallocate(capacity: totalBytes) }

        guard let imageContext = CGContext(data: data, width: width, height: height, bitsPerComponent: bitsPerComponent, bytesPerRow: width, space: colorSpace, bitmapInfo: 0) else { return nil }
        imageContext.draw(cgImage, in: CGRect(origin: CGPoint.zero, size: CGSize(width: width, height: height)))

        return UnsafeMutableBufferPointer<UInt32>(start: data, count: totalBytes)
    }

    public func normalizedGrayScalePixels() -> [CGFloat]?
    {
        guard let cgImage = self.cgImage else { return nil }
        guard let pixels = self.grayScalePixels() else { return nil }

        let width = cgImage.width
        let height = cgImage.height
        var result = [CGFloat]()

        for y in 0..<height
        {
            for x in 0..<width
            {
                let index = ((width * y) + x)
                let pixel = (CGFloat(pixels[index]) / 255.0)
                result.append(pixel)
            }
        }

        return result
    }
}
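One thing worth noting about `grayScalePixels()` is that its `defer` block frees `data` when the function returns, so the buffer pointer handed back to `normalizedGrayScalePixels()` refers to memory that has already been released. A minimal sketch of a pattern that avoids this is to let a Swift array own the bytes and return the array itself. Here `render(into:)` and `grayBytes(width:height:)` are hypothetical names, and `render(into:)` is only a stand-in for the CGContext drawing step so the example runs without UIKit; in real code, creating the context and drawing would happen inside the closure.

```swift
// Hypothetical stand-in for the CGContext drawing step: fills the buffer
// with a simple ramp so the result is easy to inspect.
func render(into buffer: UnsafeMutableBufferPointer<UInt8>) {
    for i in buffer.indices {
        buffer[i] = UInt8(truncatingIfNeeded: i)
    }
}

// Returns one byte per pixel. The array owns the storage, so the bytes
// remain valid after the function returns.
func grayBytes(width: Int, height: Int) -> [UInt8] {
    var storage = [UInt8](repeating: 0, count: width * height)
    storage.withUnsafeMutableBufferPointer { buffer in
        // The pointer is only valid inside this closure.
        render(into: buffer)
    }
    return storage
}

let pixels = grayBytes(width: 4, height: 2)
print(pixels)
```

The design choice is that no raw pointer escapes the scope in which it is valid: the caller only ever sees a value-type array.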
Code to draw the digit:
func drawLineFrom(fromPoint: CGPoint, toPoint: CGPoint)
{
    UIGraphicsBeginImageContextWithOptions(self.view.bounds.size, false, 1)

    self.tempImageView.image?.draw(at: CGPoint.zero)

    let context = UIGraphicsGetCurrentContext()
    context?.move(to: fromPoint)
    context?.addLine(to: toPoint)
    context?.setLineCap(.round)
    context?.setLineWidth(self.brushWidth)
    context?.setStrokeColor(gray: 0, alpha: 1)
    context?.strokePath()

    self.tempImageView.image = UIGraphicsGetImageFromCurrentImageContext()
    self.tempImageView.alpha = self.opacity

    UIGraphicsEndImageContext()
}

override func touchesBegan(_ touches: Set<UITouch>, with event: UIEvent?)
{
    self.swiped = false
    if let touch = touches.first {
        self.lastPoint = touch.location(in: self.view)
    }
}

override func touchesMoved(_ touches: Set<UITouch>, with event: UIEvent?)
{
    self.swiped = true
    if let touch = touches.first
    {
        let currentPoint = touch.location(in: self.view)
        self.drawLineFrom(fromPoint: self.lastPoint, toPoint: currentPoint)
        self.lastPoint = currentPoint
    }
}

override func touchesEnded(_ touches: Set<UITouch>, with event: UIEvent?)
{
    if !swiped {
        // A tap without movement still draws a single dot.
        self.drawLineFrom(fromPoint: self.lastPoint, toPoint: self.lastPoint)
    }
    self.predictionLabel.text = "Prediction: \(self.predict())"
    self.tempImageView.image = nil
}
Code to predict the digit:
private func printNumber(rowSize: Int, inputs: Vector)
{
    for (index, pixel) in inputs.enumerated()
    {
        // Start a new output line at the beginning of each pixel row.
        if index % rowSize == 0 { print() }
        if (pixel > 0) {
            print("1", terminator: " ")
        }
        else { print(".", terminator: " ") }
    }
    print()
}

private func predict() -> Scalar
{
    let resized = self.tempImageView.image!.resizedImage(CGSize(width: 28, height: 28), interpolationQuality: .high)
    let inputs = resized!.normalizedGrayScalePixels()!.flatMap({ Scalar($0) })

    self.feedforwardResult = self.neuralNetwork!.feedForward(inputs: inputs)
    self.printNumber(rowSize: 28, inputs: inputs)

    // The prediction is the index of the largest activation in the output layer.
    let max = self.feedforwardResult!.activations.last!.max()!
    let prediction = self.feedforwardResult!.activations.last!.index(of: max)!
    return Scalar(prediction)
}
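The last two lines of `predict()` pick the index of the largest output activation. As a side note, that step can be written without force unwrapping. The sketch below uses a hypothetical `argmax` helper and made-up activation values, and assumes the activations can be treated as a plain `[Float]`.

```swift
// Hypothetical helper: index of the largest value, or nil for an empty array.
func argmax(_ values: [Float]) -> Int? {
    return values.indices.max(by: { values[$0] < values[$1] })
}

// Made-up output-layer activations.
let output: [Float] = [0.01, 0.02, 0.05, 0.90, 0.02]
if let digit = argmax(output) {
    print("Prediction: \(digit)")
}
```

Returning an optional lets the caller decide what to show when the network produced no output, instead of crashing.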
【Discussion】:
Tags: ios swift core-graphics